Add `contains?` helper function to boot.janet #1017

Techcable · 2022-08-26T04:09:54Z

Before adding this, the simplest way to do this was (not (nil? (index-of val collection))) or (truthy? (index-of val collection)).

In my opinion, this is significantly less clear than the new API (contains? collection val).

Also adds four new related functions:

collection? - Tests for collection types (intentionally excluding strings)
contains-key? - Checks if a collection contains a key
- For arrays, this checks the index
contains-value? - Checks if a collection contains a value

I would say this use case for contains? is pretty important.

Most major programming languages offer some sort of contains function (Python, Java, Rust).
In Python it's even a keyword (in).

Some of the other functions are more debatable.

I intentionally excluded strings for the reasons described in the contains? comment.

TODO

This is significantly clearer than using (not (nil? (index-of col val))) Most major programming languages offer some sort of contains function (Python, Java, C, Rust). The only exception I know of is C.

bakpakin · 2022-08-26T16:36:35Z

You are like the 4th person trying to add a contains? function to core. Also contains-value? is identical to contains? (you can use index-of on dictionaries). contains-key? is the same as just (get ds key) with the optional check for nil.

So to reiterate:

(defn contains? [ds value] (not= nil (index-of value ds)))
(defn contains-key? [ds key] (not= nil (get ds key)))

is what you want. Anything else is just overhead, your "efficient" implementation is no more efficient than this.

EDIT: the not= nil check is the only value add here besides the new names, since collections deliberately do not allow nil keys, but do allow false as a key,

EDIT 2:
The above definitions also work correctly on byte sequences, unlike your own implementation which decides to raise a type error (unlike anything else in core).

bakpakin · 2022-08-26T16:39:14Z

I guess your "collection?" classification could be defined as recursive associative arrays, which is useful.

bakpakin · 2022-08-26T17:02:16Z

Looking again at your implementation, you decided to make contains? check keys for tables and structs and values for arrays and and tuples. I will not merge that.

Note this actually changes behavior from a thin wrapper over `index-of`. This is because `(index-of 3 3)` throws "error: expected iterable type, got 3"

This is just documentation of existing behavior, it does not change anything. The reason index-of throws a type error on non-iterable types is because `next` does. This is hardcoded into the JOP_NEXT opcode (see src/core/value.c:janet_next_impl). Unfortunately, there is currently no corresponding `iterable?` check.

Remove `contains-value?` because it is now redundant. Clarify in the documentation that it checks dictionary values.

Also contains-value?

No longer used to guard the type tests.

I need unit tests....

Techcable · 2022-08-26T23:40:39Z

I have removed contains-value?, removed the type error (it now returns false) and changed to support all iterable types.

Note that the builtin next function throws a type error if the value is not "iterable".
This function is implemented as JOP_NEXT and delegates to value.c:janet_next_impl (which throws an error here

This causes (index-of 3 3) to throw type errors.

Because contains? is a test function, I figure it should never throw errors (as you pointed out).
Instead, it just returns false. I do this by wrapping it in a try.

So to reiterate:
(defn contains? [ds value] (not= nil (index-of value ds)))
(defn contains-key? [ds key] (not= nil (get ds key)))
is what you want. Anything else is just overhead, your "efficient" implementation is no more efficient than this.

This is mostly the new implementation. The only change is that index-of must be wrapped in a try block to catch type errors from next and index-of (as mentioned above).

The above definitions also work correctly on byte sequences, unlike your own implementation which decides to raise a type error (unlike anything else in core).

I have changed the function to remove type tests. It should now support all iterable values, including byte sequences, generators (fibers) and abstract types.

It actually now suppresses the type errors (as discussed above).

Looking again at your implementation, you decided to make contains? check keys for tables and structs and values for arrays and and tuples. I will not merge that.

Yes. I agree this is an inconsistency. This has been corrected. The contains? functions is now equivalent to the old contains-value?.

Remaining issues (will add to TODO):

Unit tests (I'm not sure which "suite" should I add this too??)
Check over docs (again)
Consider adding string/contains? for finding substrings
- This is only a convenience/naming thing, semantically it would just be equivalent to (not (nil? (string/find needle haystack)))

bakpakin · 2022-08-27T00:01:10Z

Remove the try. In dynamic languages, the usual idea is garbage in, garbage out. You misunderstood my point about the type error. “Test” functions are not special in that regard.

bakpakin · 2022-08-27T00:02:17Z

There already is string/find which does the same as string/contains? You don't even need the nil check since string/find cannot return false.

Techcable · 2022-08-27T03:01:59Z

Remove the try. In dynamic languages, the usual idea is garbage in, garbage out. You misunderstood my point about the type error. “Test” functions are not special in that regard.

Okay, I will remove the documentation I added for the type error in index-of too.
If this is a "garbage out" situation, the type error is not really part of the function's contract (in contrast to Python, where a TypeError is usually part of the function's contract).
We may want the flexibility to change this behavior in the future (maybe change it to return nil).

Documenting the type error for index-of would also create an inconsistency with next, where the type error is not documented.

There already is string/find which does the same as string/contains? You don't even need the nil check since string/find cannot return false.

Understood. I agree. Using (when (string/find ...) ...) or (truthy? (string/find ....)) is good enough.

@bakpakin

> Remove the try. In dynamic languages, the usual idea is garbage in, garbage out. You misunderstood my point about the type error. “Test” functions are not special in that regard. > - @bakpakin

Three reasons: 1. This same behavior is not documented on the `next` function 2. This function does not throw the error directly, it only throws an error because `next` does. 3. Following the same idea as the previous commit, this behavior is more or less implementation-defined for nonsensical types > In dynamic languages, the usual idea is garbage in, garbage out. Various other documentation cleanup.

Techcable · 2022-08-27T04:50:59Z

Unit tests added, documentation cleaned up & corrected, try wrapper removed.

Should be very close to final submission (or dismissal).

Remaining tasks:

Final check over documentation
Final rebase?

bakpakin · 2022-08-27T15:54:35Z

The docstrings don't match the style of other docstrings, including references to null (not a thing in Janet), and in general are very verbose for such simple functions. The documentation is embedded into the Janet binary, please keep it to a couple of sentences, not an essay.

More of a nit, but (not= nil ...) will actually be more efficient than (not (nil? ...)), so prefer that.

bakpakin · 2022-08-27T16:15:10Z

Doing some more thinking and comparing to Clojure, which defines contains? as the contains-key? functions above. To avoid this confusion and be very explicit, I think names like has-key? and has-value? are a bit better, especially as associative arrays are the most general Janet data structure abstraction (rather than the seq as in Clojure), so I think that maps better.

Techcable · 2022-08-27T23:52:10Z

Agreed. I definitely need to clean up (and reduce the size of) the docstrings. The reference to null was just a sloppy mistake on my part.

names like has-key? and has-value? are a bit better

Thank you! This is a much better name :)

pepe

I decided to take a fresh look at this after it was hanging here for a while. Frankly, I see little value in having such functions in boot.janet. In all the library codes I have written, I have not missed such a functionality.

primo-ppcg

contains-key? seems to be completely superfluous. I would, however, like to point out that there is some potential for confusion in the current Core API. One might wrongly assume, as I initially did, that in is intended to be used when it is unknown if the key exists, and get when it is known to exist. Of course, it is precisely the opposite. To someone unfamiliar with the language, the following looks like correct code, and would likely pass review:

(if (in items key)
  ...

when of course, it is not what the author intended. If the language were still pre-release, I would strongly recommend swapping their definitions. That said, it is a very minor hurdle, and not one that justifies a convenience method, in my opinion. And with names like get and in, how much more concise could you possibly want it to be?

One thing I that will agree with, is that index-of should not be used if you don't actually mean index-of. It's a great example of bad practice. As a reviewer, if I encounter code like the following, in any language:

(if (not (index-of val items))
  ...

the first thing I need to do is stop and check the definition of index-of. "What does it return if the value isn't found? Is it possible to return a falsey value even if the value is found?" And even if I think I know the answer to these questions, I'm still going to double check to make sure, and would likely request a change either way. Whereas, if I encounter something like this:

(if (contains? items val)
  ...

I can read straight over it. The intent is clear, and the implementation is obviously a correct expression of that intent.

I'm not suggesting that this function must necessary be called contains? or that portions of this PR should be merged. I am merely submitting my options for consideration. Personally, I would much rather see something like:

(if (some (partial = val) items)
  ...

even if I know it to be less efficient.

zevv · 2023-05-30T04:42:28Z

One thing I that will agree with, is that index-of should not be used if you don't actually mean index-of.

> (if (not (index-of val items))

vs

> (if (contains? items val))

Yes, very much this. I strongly feel that code should reflect the intentions of the programmer. Reading the first line makes me stop and wonder what is happening there, while reading the second shows without doubt what the author is doing there.

Of course every language has its idioms (I read C code like return !!(a & (1<<0x08)) without blinking), but there is some value to using 'well known' names for common operations.

I feel it might be justified to have a function called contains?, returning a boolean, with well defined semantics over tuples, array, structs and objects, even if it adds yet another identifier to the global name space.

sogaiu · 2023-05-30T04:54:02Z

I appreciate:

reflect the intentions of the programmer

but I don't think it's always cost-free (efficiency-wise but also from the perspective of compatibility, growing code base, and maintenance).

Having said that, I don't know whether those parenthetical concerns apply to this case.

primo-ppcg · 2023-05-31T14:29:02Z

The three major selling points of Janet for me:

it's small
it's fast
it's expressive

code should reflect the intentions of the programmer

Does the language allow this? Does it encourage this, by design? That is expressiveness.

sogaiu · 2023-06-01T01:58:31Z

it's small
it's fast
it's expressive

Yes, nice!

Hopefully for many cases (including the one this PR is about) it's not a "choose 2 out of 3" situation :)

Does the language allow this? Does it encourage this, by design? That is expressiveness.

I can understand including "allow" in such a definition. I can appreciate the value of "encourage", but am not so familiar with including that in "expressiveness" [1]. (I like avoiding lists of 4 things though (^^;)

I would submit though that sometimes what is "encouraged" shields us from other possibilities that may occasionally be useful or better.

[1] Thanks for the food for thought -- I have something to mull over.

primo-ppcg · 2023-06-01T02:25:39Z

I can appreciate the value of "encourage", but am not so familiar with including that in "expressiveness"

What I mean specifically, is that if it is more convenient to write code that clearly expresses your intent to other developers (and your furture self) than not, then the language has encouraged you to do this, by design.

sogaiu · 2023-06-01T02:28:03Z

Ah, thank you for the elaboration. That makes sense.

bakpakin · 2023-06-01T03:28:22Z

Since most of the work has already been done and test cases added, and there is some positive feedback, I will merge this. Still there is some work needed here to make doc strings match the other docstrings, and I will rename the pair of functions to has-key? and has-value?.

Add utilities for contains? and contains-key?

12a1849

This is significantly clearer than using (not (nil? (index-of col val))) Most major programming languages offer some sort of contains function (Python, Java, C, Rust). The only exception I know of is C.

Techcable changed the title ~~Add contains? helper function to boot.janet~~ Add contains? helper function to boot.janet Aug 26, 2022

Techcable added 6 commits August 26, 2022 14:23

on bad type, contains? shuld return false (not error)

765eb84

Note this actually changes behavior from a thin wrapper over `index-of`. This is because `(index-of 3 3)` throws "error: expected iterable type, got 3"

Make contains? consistently iterate over values.

927e9e4

Remove `contains-value?` because it is now redundant. Clarify in the documentation that it checks dictionary values.

Clarify documentation of contains?

754b61c

Also contains-value?

Remove collection? type test

7203c04

No longer used to guard the type tests.

Fix incorrect usage of the test macro

f778e8b

I need unit tests....

Techcable added 3 commits August 26, 2022 20:36

Remove try from contains?, allowing type errors

a206124

> Remove the try. In dynamic languages, the usual idea is garbage in, garbage out. You misunderstood my point about the type error. “Test” functions are not special in that regard. > - @bakpakin

Add unit tests for contains?

e78a3d1

bakpakin marked this pull request as ready for review September 19, 2022 22:00

pepe reviewed Feb 6, 2023

View reviewed changes

primo-ppcg reviewed May 30, 2023

View reviewed changes

bakpakin merged commit 0fcbda2 into janet-lang:master Jun 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `contains?` helper function to boot.janet #1017

Add `contains?` helper function to boot.janet #1017

Techcable commented Aug 26, 2022 •

edited

Loading

bakpakin commented Aug 26, 2022 •

edited

Loading

bakpakin commented Aug 26, 2022

bakpakin commented Aug 26, 2022

Techcable commented Aug 26, 2022

bakpakin commented Aug 27, 2022 •

edited

Loading

bakpakin commented Aug 27, 2022 •

edited

Loading

Techcable commented Aug 27, 2022 •

edited

Loading

Techcable commented Aug 27, 2022

bakpakin commented Aug 27, 2022

bakpakin commented Aug 27, 2022

Techcable commented Aug 27, 2022

pepe left a comment

primo-ppcg left a comment •

edited

Loading

zevv commented May 30, 2023

sogaiu commented May 30, 2023

primo-ppcg commented May 31, 2023

sogaiu commented Jun 1, 2023

primo-ppcg commented Jun 1, 2023

sogaiu commented Jun 1, 2023

bakpakin commented Jun 1, 2023

Add contains? helper function to boot.janet #1017

Add contains? helper function to boot.janet #1017

Conversation

Techcable commented Aug 26, 2022 • edited Loading

TODO

bakpakin commented Aug 26, 2022 • edited Loading

bakpakin commented Aug 26, 2022

bakpakin commented Aug 26, 2022

Techcable commented Aug 26, 2022

bakpakin commented Aug 27, 2022 • edited Loading

bakpakin commented Aug 27, 2022 • edited Loading

Techcable commented Aug 27, 2022 • edited Loading

Techcable commented Aug 27, 2022

bakpakin commented Aug 27, 2022

bakpakin commented Aug 27, 2022

Techcable commented Aug 27, 2022

pepe left a comment

Choose a reason for hiding this comment

primo-ppcg left a comment • edited Loading

Choose a reason for hiding this comment

zevv commented May 30, 2023

sogaiu commented May 30, 2023

primo-ppcg commented May 31, 2023

sogaiu commented Jun 1, 2023

primo-ppcg commented Jun 1, 2023

sogaiu commented Jun 1, 2023

bakpakin commented Jun 1, 2023

Add `contains?` helper function to boot.janet #1017

Add `contains?` helper function to boot.janet #1017

Techcable commented Aug 26, 2022 •

edited

Loading

bakpakin commented Aug 26, 2022 •

edited

Loading

bakpakin commented Aug 27, 2022 •

edited

Loading

bakpakin commented Aug 27, 2022 •

edited

Loading

Techcable commented Aug 27, 2022 •

edited

Loading

primo-ppcg left a comment •

edited

Loading