This version (2017/05/27 13:44) is a draft.

[11:06:28] <temporalfox> pmlopes cescoffier good morning :-)

[11:06:37] <pmlopes> good morning

[11:06:45] <cescoffier> morning !

[11:21:41] <andyhedges> I'm pretty sure I know the answer to this question, but I'm going to ask anyway: should it always be safe to write a buffer to an HTTP response using end(buff) when the buffer was created from a byte array?

[11:28:25] <temporalfox> andyhedges why could it be a problem ?

[11:29:50] <andyhedges> Because it is a problem ;) On AWS using Amazon's Linux it goes into an infinite pause

[11:30:12] <andyhedges> Works fine on my Mac, works fine on Azure with CentOS

[11:30:18] <andyhedges> Even works on Windows

[11:31:13] <andyhedges> but on AWS I can send it into an infinite pause fairly predictably

[11:33:00] <andyhedges> 2015-06-03 08:03:20,315 341555 WARN [vertx-blocked-thread-checker] i.v.core.impl.BlockedThreadChecker - Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 162480 ms, time limit is 500

[11:33:02] <andyhedges> io.vertx.core.VertxException: Thread blocked

[11:33:04] <andyhedges> at io.vertx.core.http.impl.HttpServerResponseImpl.handleDrained(HttpServerResponseImpl.java:447) ~[[redacted].jar:na]

[11:33:06] <andyhedges> at io.vertx.core.http.impl.ServerConnection.handleInterestedOpsChanged(ServerConnection.java:295) ~[[redacted].jar:na]

[11:33:08] <andyhedges>

[11:33:45] <andyhedges> Any ideas welcome :)

[11:45:59] <andyhedges> Just confirms it works fine on RHEL on AWS too - so just Amazon's Linux on AWS - grrr

[11:46:06] <andyhedges> confirmed*

[11:48:49] <Sticky> I would do a thread dump and look for deadlocks/livelocks

[11:49:27] <andyhedges> Will take a look Sticky, thanks

[11:49:27] <temporalfox> andyhedges what if you clone the byte[] ?

[11:50:20] <andyhedges> so Buffer.buffer(b.clone())

[11:50:33] <temporalfox> to see what happens

[11:51:56] <temporalfox> at the end creating a Buffer from a String, calls getBytes() on the String

[11:57:43] <purplefox> andyhedges: the blocked thread checker should give you a stack trace telling you where the blocking is occurring

[11:59:19] <andyhedges> Yes, I pasted it above

[11:59:24] <andyhedges> it's in handleDrained

[11:59:33] <andyhedges> the clone didn't improve matters btw

[11:59:52] <andyhedges> HttpServerResponseImpl.java:447

[11:59:57] <andyhedges> is where it is blocked

[12:00:24] <andyhedges> or more helpfully io.vertx.core.http.impl.HttpServerResponseImpl.handleDrained(HttpServerResponseImpl.java:447)

[12:05:04] <purplefox> andyhedges: can you show me the full stack - there's nothing in that method that blocks

[12:05:17] <purplefox> so unless you're doing a busy wait…

[12:05:33] <andyhedges> full stack from the BlockedThreadChecker?

[12:05:36] <purplefox> yes

[12:05:48] <andyhedges> I'll pastebin it, one sec

[12:07:49] <purplefox> and do you get the same stack each time?

[12:08:01] <andyhedges> Yes, same each time

[12:08:05] <andyhedges> http://pastebin.com/RdakrE7N

[12:08:35] <purplefox> andyhedges: what version are you using?

[12:08:51] <andyhedges> milestone6

[12:09:25] <purplefox> ok, the method is synchronized, so i suspect you have a deadlock

[12:09:32] <purplefox> can you do a killall -3 java when this occurs?

[12:09:37] <purplefox> this should give you more information

[12:09:42] <andyhedges> lemme try

[12:09:50] <andyhedges> what would it deadlock with, any idea?

[12:10:15] <purplefox> don't know, but the dump should tell us

[12:12:05] <andyhedges> Did the killall, was expecting output on standard out, but nothing

[12:12:41] <purplefox> killall -3 ?

[12:12:46] <andyhedges> yup

[12:12:59] <purplefox> what jdk are you using?

[12:13:40] <andyhedges> OpenJDK 1.8.0_45-b13

[12:14:03] <purplefox> weird, that should certainly work

[12:14:20] <purplefox> you could kill -3 <pid>

[12:14:29] <purplefox> where <pid> is the pid of the process

[12:14:37] <purplefox> or kill -QUIT (but that means the same thing)

[12:14:38] <andyhedges> will try

[12:14:50] <purplefox> it's just a standard way of getting a dump

[12:15:04] <purplefox> i assume you are on linux>

[12:15:05] <purplefox> ?

[12:15:28] <purplefox> maybe your process is not called “java”

[12:15:48] <andyhedges> it's called java8 but I fixed that

[12:15:58] <andyhedges> also tried with the PID, strange

[12:16:08] <andyhedges> just googling for what might cause this

[12:16:37] <purplefox> try kill -9 <pid> and tell me what happens?

[12:17:01] <andyhedges> just got one with jstack if you are interested

[12:17:30] <purplefox> sure, however you get it, doesn't matter ;)

[12:17:49] <andyhedges> :)

[12:18:03] <purplefox> maybe you don't have a console attached to the process

[12:18:13] <purplefox> kill -3 outputs to the console
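When no console is attached to the process, one alternative to `kill -3` or jstack is to collect the stacks from inside the JVM. The following is a minimal sketch using only the JDK; the class and method names (`ThreadDumpSketch`, `dumpAllThreads`) are made up for illustration and are not part of Vert.x.

```java
import java.util.Map;

// Hypothetical helper, not a Vert.x API: builds a plain-text thread dump in-process.
class ThreadDumpSketch {

    /** Returns the name, state and stack of every live thread as text. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        // Thread.getAllStackTraces() snapshots all live threads, including the caller.
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Write the dump wherever it can be read, e.g. a log file instead of stdout.
        System.out.print(dumpAllThreads());
    }
}
```

Unlike jstack, this sketch does not report which locks each thread holds or is waiting for, so it is less useful for diagnosing a deadlock on its own.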

[12:20:02] <andyhedges> http://pastebin.com/SqEwZxHe

[12:20:18] <Sticky> “Found 1 deadlock”

[12:20:26] <andyhedges> Indeed, but why

[12:20:54] <andyhedges> I'm checking the code, probs something embarrassing I did :S

[12:21:04] <purplefox> you have two event loops deadlocking - this should never happen. can you provide a reproducer?
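A generic two-thread deadlock on `synchronized` monitors can be reproduced and detected with plain JDK classes. This is only an illustration under assumed conditions, not the actual Vert.x code path from the dump: the two daemon threads, their `demo-loop-*` names, and the `DeadlockProbe` class are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: creates a monitor deadlock and finds it via ThreadMXBean.
class DeadlockProbe {

    /** Names of threads currently deadlocked on object monitors (empty if none). */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock exists
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) { // a thread may have died between the two calls
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    /** Starts two daemon threads that take the same two locks in opposite order. */
    static void startDeadlockedPair() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread t0 = new Thread(() -> lockInOrder(lockA, lockB, bothHoldFirstLock), "demo-loop-0");
        Thread t1 = new Thread(() -> lockInOrder(lockB, lockA, bothHoldFirstLock), "demo-loop-1");
        t0.setDaemon(true); // daemon threads do not keep the JVM alive
        t1.setDaemon(true);
        t0.start();
        t1.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await(); // wait until the other thread holds its first lock too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // never reached: the other thread holds this monitor and waits for ours
            }
        }
    }

    /** Polls until a deadlock is reported or the timeout elapses. */
    static List<String> awaitDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = deadlockedThreadNames();
        while (names.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            names = deadlockedThreadNames();
        }
        return names;
    }
}
```

Calling `DeadlockProbe.startDeadlockedPair()` followed by `DeadlockProbe.awaitDeadlock(5000)` should report both demo threads, which is the same kind of finding as the "Found 1 deadlock" line in the jstack output.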

[12:21:41] <Sticky> andyhedges: no, this is not your fault

[12:21:50] <andyhedges> I can only reproduce on AWS, but I'll try and pull together some code to do so tonight

[12:21:55] <purplefox> in normal use an httpserverresponse should only be accessed by the same event loop

[12:22:04] <purplefox> but here it is being accessed by two different ones

[12:22:30] <andyhedges> There's only one Verticle with one HttpServer in it

[12:22:42] <purplefox> one verticle instance?

[12:22:57] <andyhedges> Yep, spawned from main

[12:23:05] <andyhedges> lemme double check that

[12:23:10] <purplefox> very strange

[12:23:25] <purplefox> I need to go into a meeting soon, but if you could create a reproducer we can take a look

[12:23:37] <andyhedges> Will do, I have meeting too :'(

[12:23:47] <andyhedges> Will get something together as soon as I can

[12:23:59] <andyhedges> Post to eclipse bugzilla?

[12:24:48] <purplefox> if you can push something to github that would be ideal

[12:25:12] <purplefox> btw.. i remember someone posted an almost identical issue recently on the google group

[12:30:37] <andyhedges> OK, will take a look

[12:30:42] <andyhedges> Will github it, sure.

[12:32:09] <aesteve> yikes, I was about to push my beta app to AWS this evening. I will keep an eye on it, too

[12:32:40] <andyhedges> The other guy isn't using AWS Linux fwiw

[12:32:47] <andyhedges> from the google group

[12:32:56] <andyhedges> perhaps this just makes it happen faster

[12:33:07] <andyhedges> his manifests after 12 hours

[12:33:15] <andyhedges> mind I can do with a few calls to http

[12:33:19] <andyhedges> mine*

[12:54:18] <Sticky> it is almost certainly about the speed of the machine/number of cores

[12:54:30] <Sticky> that make the deadlock more likely

[12:55:05] <Sticky> probably nothing specifically AWS linux related about it

[14:33:06] <andyhedges> I agree

[14:33:16] <andyhedges> although all the cloud machines are single core

[14:47:53] <purplefox> pmlopes: temporalfox cescoffier hi folks

[14:48:02] <cescoffier> Hi purplefox

[14:48:08] <cescoffier> how was Newcastle ?

[14:48:10] <pmlopes> hi purplefox

[14:48:17] <purplefox> been having an extremely hectic few days, but hopefully things will be back to normal soon

[14:48:25] <purplefox> newcastle was windy ;)

[14:49:36] <purplefox> so.. how's it going with you guys?

[14:50:46] <cescoffier> smoothly on my side. Fixed a couple of things in the javascript generation (the semi-colon) and in the doc gen

[14:51:09] <cescoffier> right now, I'm writing the docker manual with all the required content

[14:51:15] <cescoffier> should be done tonight

[14:51:19] <purplefox> great

[14:51:39] <cescoffier> tomorrow will focus on the core documentation

[14:52:40] <cescoffier> anything urgent I need to do in the meantime ?

[14:54:56] <cescoffier> I've to sync with temporalfox, but probably friday I will run a release dry run - to be sure everything works smoothly

[14:55:06] <purplefox> we should have a meeting soon to discuss what remains to be done

[14:56:03] <purplefox> but for now, examples, docs, docker, openshift that's all good

[14:56:37] <cescoffier> yep

[14:57:07] <cescoffier> and everything will be documented in a central place (the manual I'm writing) as well as the fabric 8 metadata, ruby, js and groovy examples

[14:57:14] <cescoffier> and a distributed application example too

[14:57:32] <cescoffier> however, the distributed app is _cheating_ right now

[14:58:22] <purplefox> cool

[14:58:30] <cescoffier> as I'm on mac, I'm using the boot2docker VM and multicast is working. I also have a working example with unicast, but would like to try it on a true distributed environment (with several machines running their own docker containers)

[14:58:40] <cescoffier> waiting to get my machine to do that ;-)

[14:58:56] <purplefox> pmlopes: hi paulo, how are you?

[14:59:45] <pmlopes> purplefox, i am fine, just spent almost all morning fighting with the tokens and kerberos but i am all set up now

[15:00:06] <cescoffier> pmlopes : did you use the google auth way too ?

[15:01:06] <pmlopes> no, i had some trouble with it so i just picked a spare yubikey that i have here

[15:01:26] <pmlopes> and it works perfect, just tap and i am in

[15:02:19] <purplefox> ok folks so regarding the work to do for 3.0, it's mainly just docs, examples, website and fixing bugs

[15:02:56] <cescoffier> we should sync on the doc writing

[15:03:18] <purplefox> yeah

[15:03:45] <purplefox> temporalfox: hi julien, are you there?

[15:04:01] <purplefox> there are a few holes in the docs right now

[15:07:50] <purplefox> bbiab

[15:08:21] <cescoffier> I've made two PR about the doc last week (https://github.com/eclipse/vert.x/pulls/cescoffier)

[15:09:16] <cescoffier> (don't ask why one passes the CLA validation and not the other one, while both have the same email….)

[15:09:44] <temporalfox> purplefox hi

[15:10:02] <purplefox> hi julien, how are things?

[15:10:47] <temporalfox> things are doing ok :-)

[15:13:50] <purplefox> thanks for doing the release

[15:13:53] <temporalfox> purplefox how is it going for you ?

[15:18:03] <purplefox> temporalfox: the last few days have been disruptive but i am looking forward to getting stuff done for the rest of the week

[15:59:59] <pmlopes> purplefox, temporalfox: i've completed an example of mongo, web and jade templates, to which repo should i upload?

[16:00:19] <purplefox> vertx-examples

[16:00:39] <purplefox> just send a PR to that :)

[16:02:18] <pmlopes> humm… that means that i need to integrate it with the examples runner, right?

[16:03:04] <temporalfox> pmlopes is it a vertx CLI example ?

[16:03:21] <pmlopes> no, it is a web app

[16:03:29] <temporalfox> ok

[16:03:51] <temporalfox> I'll try to make something that generates CLI examples from examples

[16:04:01] <temporalfox> so we can use them to test the CLI version easily

[16:08:25] <purplefox> ?

[16:08:39] <purplefox> all examples can be run at the command line too

[16:09:06] <purplefox> not sure i understand the issue here

[16:09:08] <temporalfox> how do you do that ?

[16:09:27] <purplefox> cd <example dir>

[16:09:28] <temporalfox> I haven't tried actually

[16:09:32] <purplefox> vertx run <example name>

[16:09:33] <temporalfox> does it work for non java ?

[16:09:46] <purplefox> yes, it even explains this in the readme

[16:09:51] <temporalfox> ok cool :-)

[16:10:05] <purplefox> that's kind of the point of the examples

[16:10:15] <temporalfox> I thought they were IDE only :-)

[16:10:31] <temporalfox> in M5 for instance, we haven't bundled the various template engine dependencies in the distrib

[16:10:37] <purplefox> temporalfox: rtfm ;)

[16:10:46] <temporalfox> there are so many manuals to read :-)

[16:11:01] <temporalfox> cescoffier is even writing a new one today :-)

[16:11:09] <purplefox> it's just the main README on the examples project

[16:12:19] <cescoffier> in my case it's quite easy, I'm in docker ;-)

[16:12:40] <cescoffier> in or on top or against…. depends on my mood

[16:35:32] <aesteve> pmlopes: are you using an embedded mongo database for your example ?

[16:35:59] <pmlopes> aesteve: no i run a local mongo…

[16:36:46] <aesteve> ok I was wondering if I could submit my own example too, but was afraid I couldn't because Redis & Mongo aren't embedded

[16:37:00] <aesteve> and I wasn't sure it could be run :(

[16:37:48] <purplefox> why not use an embedded one?

[16:38:19] <aesteve> cause I only thought about it recently ;)

[16:38:24] <pmlopes> aesteve: you're right it should start an embedded mongo to avoid external dependencies

[16:38:42] <aesteve> in my case I only need a Redis one

[16:42:54] <aesteve> s/only/aslo

[17:44:11] <andyhedges> Another question, is a callback handler always executed on the same thread that passed it?

[18:37:49] <temporalfox> andyhedges yes, unless it's a non vertx thread

[18:38:48] <temporalfox> for instance in a JUnit test or embedded in a main, that won't be the same thread

[18:40:36] <temporalfox> andyhedges here is an example https://github.com/vietj/vertx-materials/blob/master/src/main/asciidoc/Demystifying_the_event_loop.adoc#embedding-vertx

[18:43:37] <andyhedges> Even a vertx-mongo-client call back

[18:43:54] <andyhedges> will read the link

[18:49:31] <temporalfox> andyhedges for proxies like mongo-client it will depend on the context

[18:49:56] <temporalfox> well with mongo-client it's not a proxy

[18:50:08] <temporalfox> mongo client uses executeBlocking under the hood

[18:50:18] <temporalfox> so read the executeBlocking section :-)

[18:50:38] <temporalfox> if you use mongo-service it will be the event bus

[19:24:52] <andyhedges> so I'm using vertx-mongo-client, and I think you are saying that the callback could be on a different thread, is that right? Going to read that section now, thanks for the help :)

[19:33:38] <andyhedges> OK, so I've just proved to myself they do operate on different threads

[19:33:50] <andyhedges> and I'm assuming that's the desired behaviour

[19:34:03] <andyhedges> so I've got to read you doco in more detail now…

[19:34:08] <andyhedges> your*
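One way to check which thread runs a callback is to record `Thread.currentThread().getName()` on both sides. The sketch below is a plain-JDK analogy, assuming a single worker thread in place of Vert.x's worker pool; it does not use Vert.x or the mongo client, and `CallbackThreadCheck` and the `demo-worker` thread name are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class: compares the submitting thread with the thread running the task.
class CallbackThreadCheck {

    /** Returns { name of the calling thread, name of the thread that ran the task }. */
    static String[] callerAndCallbackThreads() {
        // Single daemon worker thread with a recognisable name (an assumption for the demo).
        ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
            Thread t = new Thread(runnable, "demo-worker");
            t.setDaemon(true);
            return t;
        });
        try {
            String caller = Thread.currentThread().getName();
            CompletableFuture<String> ranOn = new CompletableFuture<>();
            // The task records the name of the thread it actually runs on.
            worker.execute(() -> ranOn.complete(Thread.currentThread().getName()));
            return new String[] { caller, ranOn.join() };
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = callerAndCallbackThreads();
        System.out.println("submitted from: " + names[0]);
        System.out.println("executed on:    " + names[1]);
    }
}
```

The same logging approach can be placed inside a real handler to see whether it runs on an event loop thread or a worker thread.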

[21:23:29] <AlexLehm> if I have context problems in a unit test, does it make sense to fix the unit test or is this an issue that could happen in “real” uses as well?