Discussion:
[jboss-user] [JBoss Remoting] - SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)
Todd Sandor
2012-08-17 21:31:34 UTC
Permalink
Todd Sandor [https://community.jboss.org/people/toddler] created the discussion

"SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)"

To view the discussion, visit: https://community.jboss.org/message/754327#754327

--------------------------------------------------------------
Hi ...

We are using EAP5.1.1 (remoting-2-4-5.SP2) and experience a problem when using
SSL and there is a network hard failure (e.g. ethernet cable pulled), a socketWrite0() hangs
while holding a  Messaging Post Office lock (readLock()) - this hangs JMS...(many threads
need PostOffice).  Thread dump below.  It is difficult to reproduce.

A Solaris 10 environment.

This problem has only been observed when SSL is used.

In our remoting-bisocket-service.xml file, we use:
<attribute isParam="true" name="writeTimeout">30000</attribute>

Have been told "writeTimeout" is broken when SSL is being used.

It has been suggested, for SSL, we change the configuration to turn on the check_connection parameter:
<attribute isParam="true" name="socket.check_connection">true</attribute>

But my concern is:
* results in an extra round-trip I/O (not good for a variety of reasons - throughput, remote clients with high latency connecitons, etc.).
* potential to hit other problems with check_connections enabled.   e.g. found this:
https://community.jboss.org/message/369442#369442 https://community.jboss.org/message/369442

Has anyone experienced this problem and/or know a solution (other then using
"socket.check_connection") might be?
Searched and found a number of remoting issues on hard-network failure, but not this particular one...

Potentially Related Issue:
https://issues.jboss.org/browse/JBREM-1307 https://issues.jboss.org/browse/JBREM-1307

Thead Dump below:

2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#1[10.143.109.23:60321]" Id=74 RUNNABLE (in native)
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at com.sun.net.ssl.internal.ssl.OutputRecord.writeBuffer(OutputRecord.java:297)
    at com.sun.net.ssl.internal.ssl.OutputRecord.write(OutputRecord.java:286)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:743)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:731)
    at com.sun.net.ssl.internal.ssl.AppOutputStream.write(AppOutputStream.java:59)
    -  locked ***@ac9224
    at org.jboss.remoting.transport.socket.TimedOutputStream.write(TimedOutputStream.java:119)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
    -  locked ***@1aea5f0
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    -  locked ***@16e88df
    at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
    at org.jboss.messaging.core.impl.message.MessageSupport.write(MessageSupport.java:387)
    at org.jboss.jms.message.JBossMessage.write(JBossMessage.java:1069)
    at org.jboss.jms.wireformat.ClientDelivery.write(ClientDelivery.java:91)
    at org.jboss.jms.wireformat.JMSWireFormat.write(JMSWireFormat.java:237)
    at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedWrite(MicroSocketClientInvoker.java:1325)
    at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:861)
    at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:470)
    at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:169)
    at org.jboss.remoting.Client.invoke(Client.java:2070)
    at org.jboss.remoting.Client.invoke(Client.java:879)
    at org.jboss.remoting.Client.invokeOneway(Client.java:928)
    at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallback(ServerInvokerCallbackHandler.java:835)
    at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallbackOneway(ServerInvokerCallbackHandler.java:708)
    at org.jboss.jms.server.endpoint.ServerSessionEndpoint.performDelivery(ServerSessionEndpoint.java:1600)
    at org.jboss.jms.server.endpoint.ServerSessionEndpoint.handleDelivery(ServerSessionEndpoint.java:1512)
    -  locked ***@1105e96
    at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.handle(ServerConsumerEndpoint.java:353)
    -  locked ***@1b1dde3
    at org.jboss.messaging.core.impl.RoundRobinDistributor.handle(RoundRobinDistributor.java:119)
    at org.jboss.messaging.core.impl.MessagingQueue$DistributorWrapper.handle(MessagingQueue.java:617)
    at org.jboss.messaging.core.impl.ClusterRoundRobinDistributor.handle(ClusterRoundRobinDistributor.java:79)
    at org.jboss.messaging.core.impl.ChannelSupport.deliverInternal(ChannelSupport.java:681)
    at org.jboss.messaging.core.impl.MessagingQueue.deliverInternal(MessagingQueue.java:540)
    at org.jboss.messaging.core.impl.ChannelSupport.handle(ChannelSupport.java:251)
    -  locked ***@5e7463
    at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.routeInternal(MessagingPostOffice.java:3133)
    at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.route(MessagingPostOffice.java:956)
    at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.sendMessage(ServerConnectionEndpoint.java:794)
    at org.jboss.jms.server.endpoint.ServerSessionEndpoint.send(ServerSessionEndpoint.java:435)
    at org.jboss.jms.server.endpoint.advised.SessionAdvised.org$jboss$jms$server$endpoint$advised$SessionAdvised$send$aop(SessionAdvised.java:87)
    at org.jboss.jms.server.endpoint.advised.SessionAdvised$send_7280680627620114891.invokeTarget(SessionAdvised$send_7280680627620114891.java)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:111)
    at org.jboss.jms.server.container.SecurityAspect.handleSend(SecurityAspect.java:158)
    at sun.reflect.GeneratedMethodAccessor224.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.jboss.aop.advice.PerInstanceAdvice.invoke(PerInstanceAdvice.java:122)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
    at com.timetra.nms.server.j2ee.jms.messaging.JBossMessagingServerInterceptor.handleSessionSend(JBossMessagingServerInterceptor.java:346)
    at sun.reflect.GeneratedMethodAccessor223.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.jboss.aop.advice.PerInstanceAdvice.invoke(PerInstanceAdvice.java:122)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
    at org.jboss.jms.server.endpoint.advised.SessionAdvised.send(SessionAdvised.java)
    at org.jboss.jms.wireformat.SessionSendRequest.serverInvoke(SessionSendRequest.java:95)
    at org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:165)
    at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:967)
    at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:791)
    -  locked ***@465cc
    at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:744)
    at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:586)
    at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:234)

    Number of locked synchronizers = 1
    - java.util.concurrent.locks.ReentrantLock$***@12b4a3e
--------------------------------------------------------------

Reply to this message by going to Community
[https://community.jboss.org/message/754327#754327]

Start a new discussion in JBoss Remoting at Community
[https://community.jboss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2050]
Todd Sandor
2012-08-17 21:39:18 UTC
Permalink
Todd Sandor [https://community.jboss.org/people/toddler] created the discussion

"Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)"

To view the discussion, visit: https://community.jboss.org/message/754330#754330

--------------------------------------------------------------
Ooops ...Forgot to specify, the lock is held here:

at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.routeInternal(MessagingPostOffice.java:3133)

Note sure whey the thread-dump does not display that it is holding this lock...?


Many/many other threads then waiting for this lock ...e.g;

2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#122[10.217.178.53:3868]" Id=8486 WAITING on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$***@1956aca
    at java.lang.Object.wait(Native Method)
    -  waiting on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$***@1956aca
    at java.lang.Object.wait(Object.java:485)
    at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock.acquire(WriterPreferenceReadWriteLock.java:163)
    at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.getBindingForQueueName(MessagingPostOffice.java:1033)
    at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.localClose(ServerConsumerEndpoint.java:568)
    at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.close(ServerConsumerEndpoint.java:414)
    at org.jboss.jms.server.endpoint.advised.ConsumerAdvised.org$jboss$jms$server$endpoint$advised$ConsumerAdvised$close$aop(ConsumerAdvised.java:59)
    at org.jboss.jms.server.endpoint.advised.ConsumerAdvised$close_N4742752445160157748.invokeTarget(ConsumerAdvised$close_N4742752445160157748.java)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:111)
    at org.jboss.jms.server.container.ServerLogInterceptor.invoke(ServerLogInterceptor.java:105)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
    at org.jboss.jms.server.endpoint.advised.ConsumerAdvised.close(ConsumerAdvised.java)
    at org.jboss.jms.wireformat.CloseRequest.serverInvoke(CloseRequest.java:66)
    at org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:165)
    at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:967)
    at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:791)
    -  locked ***@16b5e33
    at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:744)
    at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:586)
    at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:234)

..

2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#64[10.143.109.23:41474]" Id=1414 WAITING on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$***@1956aca
    at java.lang.Object.wait(Native Method)
    -  waiting on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$***@1956aca
    at java.lang.Object.wait(Object.java:485)
    at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock.acquire(WriterPreferenceReadWriteLock.java:163)
    at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.getBindingForQueueName(MessagingPostOffice.java:1033)
...

2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#122[10.217.178.53:3868]" Id=8486 WAITING on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$***@1956aca
    at java.lang.Object.wait(Native Method)
    -  waiting on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$***@1956aca
    at java.lang.Object.wait(Object.java:485)
    at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock.acquire(WriterPreferenceReadWriteLock.java:163)
    at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.getBindingForQueueName(MessagingPostOffice.java:1033)
...

Cheers
--------------------------------------------------------------

Reply to this message by going to Community
[https://community.jboss.org/message/754330#754330]

Start a new discussion in JBoss Remoting at Community
[https://community.jboss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2050]
Ron Sigal
2012-10-28 03:36:39 UTC
Permalink
Ron Sigal [https://community.jboss.org/people/ron_sigal] created the discussion

"Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)"

To view the discussion, visit: https://community.jboss.org/message/772666#772666

--------------------------------------------------------------
Hi Todd,

You've done you're homework.

As you see in JBREM-1307, there is a bug in Java's SSLSocket that can cause a deadlock, and the suggested workaround (I haven't tried it) is to turn on socket.check_connection.  As for your concerns about doing that:

  * results in an extra round-trip I/O (not good for a variety of reasons - throughput, remote clients with high latency connecitons, etc.).

This is true. 

  * potential to hit other problems with check_connections enabled.   e.g. found this: https://community.jboss.org/message/369442#369442 https://community.jboss.org/message/369442

Note that the problem in https://community.jboss.org/message/369442#369442#369442 https://community.jboss.org/message/369442#369442 is based on a failure to set Remoting's socket timeout parameter.   I don't think it's really a deadlock.  The default client side timeout in Remoting is 30 minutes, so it just looks like a deadlock.  I'm not aware of any problems with the connect timeout.  If you're looking at the code, the connect timeout is enforced in org.jboss.remoting.transport.socket.SocketClientInvoker.connect().

Hope that helps.

-Ron
--------------------------------------------------------------

Reply to this message by going to Community
[https://community.jboss.org/message/772666#772666]

Start a new discussion in JBoss Remoting at Community
[https://community.jboss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2050]
Todd Sandor
2013-02-26 14:29:42 UTC
Permalink
Todd Sandor [https://community.jboss.org/people/toddler] created the discussion

"Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)"

To view the discussion, visit: https://community.jboss.org/message/799630#799630

--------------------------------------------------------------
I found a workaround to this SSL defect which is to set the socket linger option in OutputTimerTask.run()
before the close() is performed.

I've never contributed to the community source, but below are the
diffs, hopefully someone can get this into the community version to tell me how I go about contributing
this change into the community version.


File to change: (e.g. for tag 2.5.4.SP3)

org.jboss.remoting.transport.socket.ClientSocketWrapper.java
org.jboss.remoting.transport.socket.TimedOutputStream.java


diff ClientSocketWrapper.java ../ClientSocketWrapper.java
223c223,224
<          os = new TimedOutputStream(os, writeTimeout);
---
           Socket mySkt = super.getSocket();
           os = new TimedOutputStream(os, writeTimeout, mySkt);
diff TimedOutputStream.java ../TimedOutputStream.java
26a27
import java.net.Socket;
47a49,60
    private Socket mySkt;
     public Socket getMySkt() {
         return mySkt;
     }
     public TimedOutputStream(OutputStream os, int outputTimeout, Socket pmySkt)
     {
         this.os = os;
         this.outputTimeout = outputTimeout;
         this.mySkt = pmySkt;
     }
134c147
<      
---
142,151c155,192
<          try
<          {
<             log.debug(this + " closing: " + tos);
<             tos.close();
<             tos = null;
<          }
<          catch (IOException e)
<          {
<             log.debug("unable to close " + tos);
<          }
---
           try
           {
               Socket mySocket = tos.getMySkt();
               if (mySocket != null)
               {
                   // set the linger on with duration 0, results RST being sent
                   // and socket send and receive buffers being discarded (does not
                   // wait of the existing write() to complete].  There is no use in
                   // setting the soLingerDuration longer then 0 (thus waiting for the write to
                   // complete), since we've already waited "writeTimeout" seconds on this write.
                   boolean on = true;
                   int soLingerDuration = 0;
                   log.debug(this + " socketWrite TIMEOUT: set soLinger(ON,0): " + tos);
                   mySocket.setSoLinger(on, soLingerDuration);
               }
           }
           catch (IOException e)
           {
               log.debug(" socketWrite TIMEOUT: Attempting to set linger on socket stream [" + tos +
                       "], Exception [" + e.toString() + "]");
           }
           finally
           {
               log.debug(this + " socketWrite TIMEOUT: close(): " + tos);
               try
               {
                   tos.close();
               }
               catch (Exception e)
               {
                   // ignore - log log error
                   log.error("Failed to close() socket stream Exception: ", e);
               }
               log.debug(this + " socketWrite TIMEOUT: close() completed.");
               tos = null;
           }
153c194
<      
---
--------------------------------------------------------------

Reply to this message by going to Community
[https://community.jboss.org/message/799630#799630]

Start a new discussion in JBoss Remoting at Community
[https://community.jboss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2050]
Ron Sigal
2013-04-02 23:37:27 UTC
Permalink
Ron Sigal [https://community.jboss.org/people/ron_sigal] created the discussion

"Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)"

To view the discussion, visit: https://community.jboss.org/message/806116#806116

--------------------------------------------------------------
Hi  Todd,

Thank you for working on this issue.  I've put a link to this thread on JBREM-1307.  I don't have any projection for a next release of Remoting 2, but when the time comes, I will certainly look at patch.

And give you all the credit.  ;)

-Ron
--------------------------------------------------------------

Reply to this message by going to Community
[https://community.jboss.org/message/806116#806116]

Start a new discussion in JBoss Remoting at Community
[https://community.jboss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2050]
Loading...