Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changed net/connect to be non-blocking / asynchronous #1139

Merged
merged 3 commits into from
May 20, 2023

Conversation

zevv
Copy link
Contributor

@zevv zevv commented May 16, 2023

This PR makes net/connect non-blocking, which I believe is (apart from name resolving) the last change needed to make the Janet network impl all nice and async.

I'd appreciate some feedback though:

  • can anyone who knows the event loop code tells me if I did this right?
  • closing the socket on failing connect is still missing; what's the right way to handle this?
  • not sure what to do on getsockopt() failure, panic, assert?

@zevv zevv force-pushed the async-connect branch from f818581 to df715d4 Compare May 16, 2023 17:01
@bakpakin
Copy link
Member

bakpakin commented May 17, 2023

With a cursory glance, everything looks correct. The intuition about calling janet_stream_close on a bad connect looks ok at first glance, any particular reason it is commented out?

not sure what to do on getsockopt() failure, panic, assert?

Don't assert, instead panic and clean up as if connect had failed. Generally, we don't assert anything about syscalls and try to convert errno -> janet panic.

@zevv
Copy link
Contributor Author

zevv commented May 17, 2023

With a cursory glance, everything looks correct. The intuition about calling janet_stream_close on a bad connect looks ok at first glance, any particular reason it is commented out?

Yes, SIGSEGV later because of a use-after-free, but I wasn't able yet to fully grasp the mechanics involved.

Don't assert, instead panic and clean up as if connect had failed. Generally, we don't assert anything about syscalls and try to convert errno -> janet panic.

Right, so that would be janet_cancel(s->fiber, janet_cstringv(strerror(r))); I guess, because the call fails "on behalf" of the fiber calling net/connect

@zevv
Copy link
Contributor Author

zevv commented May 17, 2023

@bakpakin When calling janet_stream_close():

connect
==894789== Invalid read of size 8
==894789==    at 0x132FCC: janet_loop1_impl (in /home/ico/external/janet/build/janet)
==894789==    by 0x1565B4: janet_loop1 (in /home/ico/external/janet/build/janet)
==894789==    by 0x15670C: janet_loop (in /home/ico/external/janet/build/janet)
==894789==    by 0x156CD2: janet_loop_fiber (in /home/ico/external/janet/build/janet)
==894789==    by 0x11BCAC: main (in /home/ico/external/janet/build/janet)
==894789==  Address 0x4cea900 is 0 bytes inside a block of size 88 free'd
==894789==    at 0x484317B: free (vg_replace_malloc.c:872)
==894789==    by 0x132A17: janet_stream_close (in /home/ico/external/janet/build/janet)
==894789==    by 0x132D68: ev_machine_write (in /home/ico/external/janet/build/janet)
==894789==    by 0x132F99: janet_loop1_impl (in /home/ico/external/janet/build/janet)
==894789==    by 0x1565B4: janet_loop1 (in /home/ico/external/janet/build/janet)
==894789==    by 0x15670C: janet_loop (in /home/ico/external/janet/build/janet)
==894789==    by 0x156CD2: janet_loop_fiber (in /home/ico/external/janet/build/janet)
==894789==    by 0x11BCAC: main (in /home/ico/external/janet/build/janet)
==894789==  Block was alloc'd at
==894789==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==894789==    by 0x133237: janet_listen (in /home/ico/external/janet/build/janet)
==894789==    by 0x13374B: janet_ev_connect (in /home/ico/external/janet/build/janet)
==894789==    by 0x145B42: cfun_net_connect (in /home/ico/external/janet/build/janet)
==894789==    by 0x15103C: run_vm (in /home/ico/external/janet/build/janet)
==894789==    by 0x1553EA: janet_continue_no_check (in /home/ico/external/janet/build/janet)
==894789==    by 0x15632D: janet_loop1 (in /home/ico/external/janet/build/janet)
==894789==    by 0x15670C: janet_loop (in /home/ico/external/janet/build/janet)
==894789==    by 0x156CD2: janet_loop_fiber (in /home/ico/external/janet/build/janet)
==894789==    by 0x11BCAC: main (in /home/ico/external/janet/build/janet)

@zevv zevv force-pushed the async-connect branch from df715d4 to 52d1fbf Compare May 18, 2023 08:42
@zevv zevv force-pushed the async-connect branch from 52d1fbf to 8d78fb1 Compare May 18, 2023 08:55
@zevv
Copy link
Contributor Author

zevv commented May 18, 2023

So, on closer inspection the problem seems to be that closing the stream at that point is too early, because later in the loop implementation the stream and it states are still referenced and checked for other events.

I guess the right thing to do would be to flag the stream as 'to be closed' (add a stream flag?), and after the last loop iteration close the stream if this flag is set.

@zevv zevv force-pushed the async-connect branch from f52cb7a to 8d78fb1 Compare May 18, 2023 09:28
@zevv zevv force-pushed the async-connect branch from 606eeeb to c3e28bc Compare May 18, 2023 12:10
@bakpakin
Copy link
Member

Is there any need to close the stream, or can you just drop it and let the GC take care of it? Might be easier that way - I haven't run with the code yet locally, but elsewhere in the event loop implementation we don't explicitly close anything until the user closes it.

@bakpakin
Copy link
Member

On second thought, it is probably better to have to close rather than leaving file descriptors open for an undetermined amount of time.

@bakpakin
Copy link
Member

LGTM, I will run some BSD testing and if that checks out I will merge.

@zevv
Copy link
Contributor Author

zevv commented May 19, 2023

Yeah, GC-ing file descriptors or sockets is kind of nasty, I agree we better clean them up.

@bakpakin
Copy link
Member

There is a bug when JANET_NO_EPOLL is defined -> a JanetListenerState is used after being deallocated.

Here is a fix:

diff --git a/src/conf/janetconf.h b/src/conf/janetconf.h
index 73e39d55..82b173e4 100644
--- a/src/conf/janetconf.h
+++ b/src/conf/janetconf.h
@@ -49,7 +49,7 @@
 /* #define JANET_STACK_MAX 16384 */
 /* #define JANET_OS_NAME my-custom-os */
 /* #define JANET_ARCH_NAME pdp-8 */
-/* #define JANET_EV_NO_EPOLL */
+#define JANET_EV_NO_EPOLL
 /* #define JANET_EV_NO_KQUEUE */
 /* #define JANET_NO_INTERPRETER_INTERRUPT */
 
diff --git a/src/core/ev.c b/src/core/ev.c
index 31b0a2f6..fd3a7b9a 100644
--- a/src/core/ev.c
+++ b/src/core/ev.c
@@ -1969,6 +1969,7 @@ void janet_loop1_impl(int has_timeout, JanetTimestamp timeout) {
         JanetAsyncStatus status3 = JANET_ASYNC_STATUS_NOT_DONE;
         JanetAsyncStatus status4 = JANET_ASYNC_STATUS_NOT_DONE;
         state->event = pfd;
+        JanetStream *stream = state->stream;
         if (mask & POLLOUT)
             status1 = state->machine(state, JANET_ASYNC_EVENT_WRITE);
         if (mask & POLLIN)
@@ -1983,7 +1984,6 @@ void janet_loop1_impl(int has_timeout, JanetTimestamp timeout) {
                 status4 == JANET_ASYNC_STATUS_DONE)
             janet_unlisten(state, 0);
         /* Close the stream if requested and no more listeners are left */
-        JanetStream *stream = state->stream;
         if ((stream->flags & JANET_STREAM_TOCLOSE) && !stream->state) {
             janet_stream_close(stream);
         }

@zevv
Copy link
Contributor Author

zevv commented May 19, 2023

Applied, thanks.

Would it make sense to run CI with the JANET_EV_NO_EPOLL case as well?

@bakpakin
Copy link
Member

We already do on sr.ht which is why I don't here, but I guess it would help for contributions.

@bakpakin bakpakin merged commit b621d4d into janet-lang:master May 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants