45 - amq_server aborts when it receives a socket error event

Reported by dchaw (Unix timestamp 1233958305)


Steps to reproduce:
1. Run amq_server.
2. Have many clients open concurrent connections to amq_server. In my scenario, 245 clients try to connect simultaneously, and each client opens two connections.
3. amq_server receives a socket error event, treats it as an unrecognised event, and aborts. See the backtrace below.
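The steps above can be approximated with a small load client. The following is a minimal sketch using plain POSIX sockets, not the OpenAMQ client API: it only opens TCP connections and holds them, and never performs the AMQP handshake, so it is an assumption that this is enough to trigger the same error. The names `open_connections` and `close_connections` are hypothetical helpers introduced for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes getaddrinfo() under strict -std modes */
#endif
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open up to `count` TCP connections to host:port
 * and keep them open. Descriptors are stored in fds[]; the return value
 * is the number of connections that succeeded. */
int open_connections(const char *host, const char *port, int count, int *fds)
{
    struct addrinfo hints, *res = NULL;
    int opened = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return 0;

    for (int i = 0; i < count; i++) {
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0)
            break;                /* the client itself ran out of descriptors */
        if (connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
            close(fd);
            break;                /* server refused the connection or is gone */
        }
        fds[opened++] = fd;
    }
    freeaddrinfo(res);
    return opened;
}

/* Hypothetical helper: close everything opened above. */
void close_connections(const int *fds, int count)
{
    for (int i = 0; i < count; i++)
        close(fds[i]);
}
```

To mirror the scenario, a caller would request 490 connections (245 clients with two connections each) against the server's listening port. The report does not state the port; 5672 is the registered AMQP port and is only an assumption here. The client process needs a descriptor limit above the number of connections it opens.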

Note: In my setup, the maximum number of file descriptors per process is 1024.
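For anyone comparing their own setup against this limit, the sketch below reads the per-process descriptor limit with the POSIX `getrlimit` call and works out how many sockets the scenario needs. The function names `fd_soft_limit` and `connections_needed` are hypothetical. Note that 245 clients with two connections each is 490 sockets, which is below 1024 on its own; the report does not say what else was consuming descriptors in the server process.

```c
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical helper: soft limit on open descriptors for this process,
 * or -1 if it cannot be read. An unlimited setting is reported as LONG_MAX. */
long fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t) LONG_MAX)
        return LONG_MAX;
    return (long) rl.rlim_cur;
}

/* Hypothetical helper: server-side sockets needed for the client load alone. */
long connections_needed(long clients, long connections_per_client)
{
    return clients * connections_per_client;
}
```

Comparing `connections_needed(245, 2)` with `fd_soft_limit()` inside the server process gives a rough idea of the headroom, keeping in mind that listening sockets, log files and other internal descriptors also count against the limit.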

Feature request: OpenAMQ should fail gracefully (for example, by rejecting further connection requests) when it encounters a socket error after reaching the per-process file descriptor limit.
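One way to picture the requested behaviour is an accept path that sorts errors into "retry", "back off" and "fatal" instead of aborting on anything unexpected. This is a sketch in plain POSIX C under stated assumptions: it is not OpenAMQ or SMT code, the backtrace does not establish that the failing call was `accept()`, and the names `accept_action`, `classify_accept_error` and `accept_once` are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* What the caller should do after one accept attempt. */
typedef enum {
    ACCEPT_OK,       /* a client descriptor was returned */
    ACCEPT_RETRY,    /* transient condition, try again right away */
    ACCEPT_BACKOFF,  /* out of descriptors or memory: stop accepting for a while */
    ACCEPT_FATAL     /* programming or configuration error on the listener */
} accept_action;

/* Map an errno value from accept() to an action.
 * if-chains are used because EAGAIN and EWOULDBLOCK may share a value. */
accept_action classify_accept_error(int err)
{
    if (err == EINTR || err == ECONNABORTED ||
        err == EAGAIN || err == EWOULDBLOCK)
        return ACCEPT_RETRY;
    if (err == EMFILE || err == ENFILE || err == ENOBUFS || err == ENOMEM)
        return ACCEPT_BACKOFF;
    return ACCEPT_FATAL;
}

/* Try to accept one connection without ever aborting the process. */
accept_action accept_once(int listen_fd, int *client_fd)
{
    int fd = accept(listen_fd, NULL, NULL);

    if (fd >= 0) {
        *client_fd = fd;
        return ACCEPT_OK;
    }
    return classify_accept_error(errno);
}
```

With this shape, hitting the descriptor limit (`EMFILE`) leaves existing connections untouched; the server would log the condition and stop polling the listening socket until some descriptors are released. New clients then wait in the listen backlog or time out, which is a softer form of the "reject future connection requests" behaviour asked for above.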

The backtrace follows. I also have the core file if anybody needs it.

Core was generated by `./amq_server -s amq_server.cfg'.
Program terminated with signal 6, Aborted.
#0 0x0000003d26430155 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003d26430155 in raise () from /lib64/libc.so.6
#1 0x0000003d26431bf0 in abort () from /lib64/libc.so.6
#2 0x0000000000466032 in report_unrecognised_event_error (thread=0x19034aa8)
at amq_server_agent.c:14422
#3 0x0000000000467cc6 in amq_server_agent_manager (thread_p=0x4564e110)
at amq_server_agent.c:4378
#4 0x000000000052bdcd in s_execute (apr_thread=<value optimized out>,
data=0x18f3c540) at smt_os_thread.c:3216
#5 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#6 0x0000003d264d1ded in clone () from /lib64/libc.so.6

(gdb) thread apply all bt

Thread 10 (process 19962):
#0 0x0000003d2700a4b6 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
#1 0x000000000052a671 in smt_wait (msecs=0)
at /home/dchaw/wrk/apps/OpenAMQ-1.2e1/base-2.2b1/_install/include/icl.h:2080
#2 0x000000000040499a in main (argc=3, argv=0x7fff846ec008)
at amq_server_main.inc:281

Thread 9 (process 19963):
#0 0x0000003d264cb332 in select () from /lib64/libc.so.6
#1 0x00000000004d9bc5 in apr_sleep (t=<value optimized out>)
at time/unix/time.c:246
#2 0x0000000000528792 in s_time_update (apr_thread=<value optimized out>,
data=0x0) at smt_os_thread.c:2777
#3 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#4 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 8 (process 19964):
#0 0x0000003d2700a4b6 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
#1 0x000000000052ba14 in s_execute (apr_thread=<value optimized out>,
data=<value optimized out>)
at /home/dchaw/wrk/apps/OpenAMQ-1.2e1/base-2.2b1/_install/include/icl.h:2080
#2 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 7 (process 19965):
#0 0x0000003d264c92a6 in poll () from /lib64/libc.so.6
#1 0x000000000054e342 in apr_pollset_poll (pollset=0x18e0c890, timeout=543,
num=0x42e4a0e4, descriptors=0x42e4a0d8) at poll/unix/poll.c:301
#2 0x000000000053daf0 in smt_socket_request_wait (os_thread=0x18e0c2f8)
at smt_socket_request.c:2246
#3 0x000000000052b6f6 in s_execute (apr_thread=<value optimized out>,
data=0x18e44820) at smt_os_thread.c:3031
#4 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 6 (process 19966):
#0 0x0000003d2700a4b6 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
#1 0x000000000052ba14 in s_execute (apr_thread=<value optimized out>,
data=<value optimized out>)
at /home/dchaw/wrk/apps/OpenAMQ-1.2e1/base-2.2b1/_install/include/icl.h:2080
#2 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 5 (process 19967):
#0 0x0000003d2700a4b6 in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib64/libpthread.so.0
#1 0x000000000052ba14 in s_execute (apr_thread=<value optimized out>,
data=<value optimized out>)
at /home/dchaw/wrk/apps/OpenAMQ-1.2e1/base-2.2b1/_install/include/icl.h:2080
#2 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#3 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 4 (process 19968):
#0 0x0000003d264c92a6 in poll () from /lib64/libc.so.6
#1 0x000000000054e342 in apr_pollset_poll (pollset=0x18ec3da0, timeout=6957,
num=0x44c4d0e4, descriptors=0x44c4d0d8) at poll/unix/poll.c:301
#2 0x000000000053daf0 in smt_socket_request_wait (os_thread=0x18ec3808)
at smt_socket_request.c:2246
#3 0x000000000052b6f6 in s_execute (apr_thread=<value optimized out>,
data=0x18efddd0) at smt_os_thread.c:3031
#4 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 3 (process 19970):
#0 0x0000003d264c92a6 in poll () from /lib64/libc.so.6
#1 0x000000000054e342 in apr_pollset_poll (pollset=0x18f40c80, timeout=24,
num=0x4604f0e4, descriptors=0x4604f0d8) at poll/unix/poll.c:301
#2 0x000000000053daf0 in smt_socket_request_wait (os_thread=0x18f406e8)
at smt_socket_request.c:2246
#3 0x000000000052b6f6 in s_execute (apr_thread=<value optimized out>,
data=0x18f7acb0) at smt_os_thread.c:3031
#4 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 2 (process 19971):
#0 0x0000003d264c92a6 in poll () from /lib64/libc.so.6
#1 0x000000000054e342 in apr_pollset_poll (pollset=0x18f7f3f0, timeout=34,
num=0x46a500e4, descriptors=0x46a500d8) at poll/unix/poll.c:301
#2 0x000000000053daf0 in smt_socket_request_wait (os_thread=0x18f7ee58)
at smt_socket_request.c:2246
#3 0x000000000052b6f6 in s_execute (apr_thread=<value optimized out>,
data=0x18fb9420) at smt_os_thread.c:3031
#4 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003d264d1ded in clone () from /lib64/libc.so.6

Thread 1 (process 19969):
#0 0x0000003d26430155 in raise () from /lib64/libc.so.6
#1 0x0000003d26431bf0 in abort () from /lib64/libc.so.6
#2 0x0000000000466032 in report_unrecognised_event_error (thread=0x19034aa8)
at amq_server_agent.c:14422
#3 0x0000000000467cc6 in amq_server_agent_manager (thread_p=0x4564e110)
at amq_server_agent.c:4378
#4 0x000000000052bdcd in s_execute (apr_thread=<value optimized out>,
data=0x18f3c540) at smt_os_thread.c:3216
#5 0x0000003d27006307 in start_thread () from /lib64/libpthread.so.0
#6 0x0000003d264d1ded in clone () from /lib64/libc.so.6
