In that case, each line is processed sequentially, with a complete array being created between each step. Nothing actually gets pipelined.
Despite being clean and readable, I don't tend to do it any more, because it's harder to debug. More often these days, I write things like this:
data = File.readlines("haystack.txt")
data = data.map(&:strip)
data = data.grep(/needle/)
data = data.map { |i| i.gsub('foo', 'bar') }
data = data.map { |i| File.readlines(i).count }
It's ugly, but you know what? I can set a breakpoint anywhere and inspect the intermediate states without having to edit the script in prod. Sometimes ugly and boring is better.
wahern [3 hidden]5 mins ago
> The author keeps calling it "pipelining", but I think the right term is "method chaining". [...] You get a similar effect with coroutines.
The inventor of the shell pipeline, Douglas McIlroy, always understood the equivalence between pipelines and coroutines; it was deliberate. See https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf It goes even deeper than it appears, too. The way pipes were originally implemented in the Unix kernel was that when the pipe buffer was filled[1] by the writer, the kernel continued execution directly in the blocked reader process without bouncing through the scheduler. Effectively, arguably literally, coroutines; one process calls the write function and execution continues with a read call returning the data.
Interestingly, Solaris Doors operate the same way by design--no bouncing through the scheduler--unlike pipes today where long ago I think most Unix kernels moved away from direct execution switching to better support multiple readers, etc.
[1] Or even on the first write? I'd have to double-check the source again.
nine_k [3 hidden]5 mins ago
In Python, steps like map() and filter() execute lazily, streaming elements without building large intermediate arrays. It lacks the chaining syntax for them, though.
Java streams are the closest equivalent, both in the lazy, element-at-a-time execution model and syntactically. And yes, the Java debugger can show you the state of the intermediate streams.
marhee [3 hidden]5 mins ago
I don’t find your “seasoned developer” version ugly at all. It just looks more mature and relaxed. It also has the benefits that you can actually do error handling and have space to add comments.
Maybe people don’t like it because of the repetition of “data =“ but in fact you could use descriptive new variable names making the code even more readable (auto documenting).
I’ve always felt method chaining to look “cramped”, if that’s the right word. Like a person drawing on paper but only using the upper left corner. However, this surely is also a matter of preference or what you’re used to.
freehorse [3 hidden]5 mins ago
I have a lot of code like this. The reason I prefer pipelines now is the mental overhead of the intermediate step variables.
That style is a hell to read and understand later imo. You have to read a lot of intermediate variables that do not matter anywhere else in the code after you set them up, but you do not necessarily know in advance which matter and which don't unless you read and understand all of it. Also, it pollutes your workspace with too much stuff, so while this makes it easier to debug, it makes it harder to read some time later. Moreover, it becomes even clumsier if you need to repeat code; you probably need to define a function block then, which just moves the clumsiness there.
What I do now is start by defining the transformation in each step as a pure function, chain them once everything works, and enclose the whole thing in an error handler so that I depend on breakpoint debugging less.
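As a sketch of that workflow (Elixir purely for illustration, since the comment doesn't name a language, and `lines` is an assumed list of strings): each step is a plain function you can test on its own, and the pipeline is only assembled at the end:

```
# each step is a plain, independently testable function
strip_all    = fn ls -> Enum.map(ls, &String.trim/1) end
keep_needles = fn ls -> Enum.filter(ls, &String.contains?(&1, "needle")) end
fix_foo      = fn ls -> Enum.map(ls, &String.replace(&1, "foo", "bar")) end

# chained only once each piece works on its own
lines
|> strip_all.()
|> keep_needles.()
|> fix_foo.()
```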
There is certainly a trade-off, but as a codebase grows larger and deals with more cases where the same code needs to be applied, the benefits of a concise yet expressive notation show.
deredede [3 hidden]5 mins ago
Code in this "named-pipeline" style is already self-documenting: using the same variable name makes it clear that we are dealing with a pipeline/chain. Using more descriptive names for the intermediate steps hides this, making each line more readable (and even then you're likely to end up with `dataStripped = data.map(&:strip)`) at the cost of making the block as a whole less readable.
ehnto [3 hidden]5 mins ago
In most debuggers I have used, if you put a breakpoint on the first line of the method chain, you can "step over" each function in the chain until you get to the one you want.
Bit annoying, but serviceable. Though there's nothing wrong with your approach either.
grimgrin [3 hidden]5 mins ago
debuggers can take it even further if they want that UX. in firefox given a chain of foo().bar().baz() you can set a breakpoint on any of 'em.
I do the same with Python, replacing multilevel comprehensions with intermediary steps of generator expressions, which are lazy and therefore do not impact performance and memory usage.
Ultimately it will depend on the functions being chained. If they can work with one part of the result, or a subset of parts, then they might not block; otherwise they will still need the complete result and laziness cannot help.
hbogert [3 hidden]5 mins ago
Not much different from having a `sort` in shell pipeline I guess?
ses1984 [3 hidden]5 mins ago
Shouldn’t modern debuggers be able to handle that easily? You can step in, step out, until you get where you want, or you could set a breakpoint in the method you want to debug instead of at the call site.
abirch [3 hidden]5 mins ago
Even if your debugger can't do that, an AI agent can easily change it for you.
AdieuToLogic [3 hidden]5 mins ago
> The author keeps calling it "pipelining", but I think the right term is "method chaining".
I believe the correct definition for this concept is the Thrush combinator[0]. In some ML-based languages[1], such as F#, the |> operator is defined[2] for same:
[1..10] |> List.map (fun i -> i + 1)
Other functional languages have libraries which also provide this operator, such as the Scala Mouse[3] project.
I'm not sure that's right, method chaining is just immediately acting on the return of the previous function, directly. It doesn't pass the return into the next function like a pipeline. The method must exist on the returned object. That is different to pipelines or thrush operators. Evaluation happens in the order it is written.
Unless I misunderstood the author, because method chaining is super common where I feel thrush operators are pretty rare, I would be surprised if they meant the latter.
dzuc [3 hidden]5 mins ago
For debugging method chains you can just use `tap`
3np [3 hidden]5 mins ago
I have to object to reusing the 'data' var. Make up a new name for each assignment, in particular when types and data structures change (like the last step switching from strings to ints).
Other than that I think both styles are fine.
hiq [3 hidden]5 mins ago
I agree with this comment: https://news.ycombinator.com/item?id=43759814 that this pollutes the current scope, which is especially bad if scoping is not that narrow (the case in Python, where if-branches do not define their own scope; I don't know for Ruby).
Another problem of having different names for each step is that you can no longer quickly comment out a single step to try things out, which you can if you either have the pipeline or a single variable name.
adolph [3 hidden]5 mins ago
Isn't the difference between a pipeline and a method chain that a pipeline doesn't have to wait for the previous process to complete in order to send results to the next step? Grep sends lines as it finds them to sed, and sed on to xargs, which acts as a sink to collect the data (and is necessary, otherwise wc -l would write out a series of ones).
Given File.readlines("haystack.txt"), the entire file must be resident in memory before .grep(/needle/) is performed, which may cause unnecessary memory utilization. Iirc, in frameworks like Polars, the collect() method that ends the chain tells the query engine that the preceding methods can be performed as a stream, and thus not require pulling the entire corpus into memory in order to operate on a subset of it.
refactor_master [3 hidden]5 mins ago
> Despite being clean and readable, I don't tend to do it any more, because it's harder to debug. More often these days, I write things like this:
data = File.readlines("haystack.txt")
data = data.map(&:strip)
data = data.grep(/needle/)
data = data.map { |i| i.gsub('foo', 'bar') }
data = data.map { |i| File.readlines(i).count }
Hard disagree. It's less readable, the intent is unclear (where does it end?), the variables are rewritten on every step, and everything is named "data" (and please don't call them data_1, data_2, ...), so now you have to run a debugger to figure out what is even going on, rather than just... reading the code.
veidr [3 hidden]5 mins ago
The person you are quoting already conceded that it is less readable, but argued that the ability to set a breakpoint easily (without having to stop the process and modify the code) is more important.
I myself agree, and find myself doing that too, especially in frontend code that executes in a browser. Debuggability is much more important than marginally-better readability, for production code.
axblount [3 hidden]5 mins ago
Syntactic sugar can sometimes fool us into thinking the underlying process is more efficient or streamlined. As a new programmer, I probably would have assumed that "storing" `data` at each step would be more expensive.
bjoli [3 hidden]5 mins ago
Reading this, I am so happy that my first language was a scheme where I could see the result of the first optimization passes.
This helped me quickly develop a sense for how code is optimized and what code is eventually executed.
wahern [3 hidden]5 mins ago
It absolutely becomes very inefficient, though the threshold data set size varies according to context. Most languages don't have lightweight coroutines as an alternative (but see Lua!), so the convenient alternatives have a larger fixed cost. Plus, cache locality means processing a whole array at each step might be just as good, or even better, than switching back and forth for every data element, though coroutine-based approaches can also use buffering strategies, which not coincidentally is how pipes work.
But, yes, naive call chaining like that is sometimes a significant performance problem in the real world. For example, in the land of JavaScript. One of the more egregious examples I've personally seen was a Bash script that used Bash arrays rather than pipelines, though in that case it had to do with the loss of concurrency, not data churn.
slt2021 [3 hidden]5 mins ago
If you work with I/O, where you can have all sorts of wrong/invalid data and I/O errors, chaining is a nightmare, as each link in the chain can produce numerous different errors/exceptions.
Chaining really only works if your language is strongly typed and you are somewhat guaranteed that variables will be of the expected type.
raverbashing [3 hidden]5 mins ago
Exactly that. It looks nice but it's annoying to debug
I do it in a similar way you mentioned
jjfoooo4 [3 hidden]5 mins ago
I think updating the former to the latter when you are actually debugging something isn’t that big of a deal.
But with actually checked in code, the tradeoff in readability is pretty substantial
inkyoto [3 hidden]5 mins ago
> Each of those components executes in parallel, with the intermediate results streaming between them. You get a similar effect with coroutines.
Processes run in parallel, but they process the data in a strict sequential order: «grep» must produce a chunk of data before «sed» can proceed, and «sed» must produce another chunk of data before «xargs» can do its part. «xargs» in no way can ever pick up the output of «grep» and bypass the «sed» step. If the preceding step is busy crunching the data and is not producing the data, the subsequent step will be blocked (the process will fall asleep). So it is both, a pipeline and a chain.
It is actually a directed data flow graph.
Also, if you replace «haystack.txt» with a /dev/haystack, i.e.
and /dev/haystack is waiting on the device it is attached to to yield a new chunk of data, all of the three, «grep», «sed» and «xargs» will block.
bnchrch [3 hidden]5 mins ago
I'm personally someone who advocates for languages to keep their feature set small and shoot to achieve a finished feature set quickly.
However.
I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.
```
params
|> Map.get("user")
|> create_user()
|> notify_admin()
```
AlchemistCamp [3 hidden]5 mins ago
I've been using Elixir for a long time and had that same hope after having experienced how clear, concise and maintainable apps can be when the core is all a bunch of pipelines (and the boundary does error handling using cases and withs). But having seen the pipe operator in Ruby, I now think it was a bad idea.
The problem is that method-chaining is common in several OO languages, including Ruby. This means the functions on an object return an object, which can then call other functions on itself. In contrast, the pipe operator calls a function, passing in what's on the left side of it as the first argument. In order to work properly, this means you'll need functions that take the data as the first argument and return the same shape of data, whether that's a list, a map, a string or a struct, etc.
When you add a pipe operator to an OO language where method-chaining is common, you'll start getting two different types of APIs and it ends up messier than if you'd just stuck with chaining method calls. I much prefer passing immutable data into a pipeline of functions as Elixir does it, but I'd pick method chaining over a mix of method chaining and pipelines.
Cyykratahk [3 hidden]5 mins ago
We might be able to cross one more language off your wishlist soon, Javascript is on the way to getting a pipeline operator, the proposal is currently at Stage 2
A while ago, I wondered how close you could get to a pipeline operator using existing JavaScript features. In case anyone might like to have a look, I wrote a proof-of-concept function called "Chute" [1]. It chains function and method calls in a dot-notation style like the basic example below.
chute(7) // setup a chute and give it a seed value
.toString // call methods of the current data (parens optional)
.parseInt // send the current data through global native Fns
.do(x=>[x]) // through a chain of one or more local / inline Fns
.JSON.stringify // through nested global functions (native / custom)
.JSON.parse
.do(x=>x[0])
.log // through built in Chute methods
.add_one // global custom Fns (e.g. const add_one=x=>x+1)
() // end a chute with '()' and get the result
It has also barely seen any activity in years. It is going nowhere. The TC39 committee is utterly dysfunctional and anti-progress, and will not let this or any other new syntax into JavaScript. Records and tuples have just been killed, despite being cited in surveys as a major missing feature[1]. Pattern matching is stuck in stage 1 and hasn't been presented since 2022. Ditto for type annotations and a million other things.
Our only hope is if TypeScript finally gives up on the broken TC39 process and starts to implement its own syntax enhancements again.
I wouldn’t hold your breath for TypeScript introducing any new supra-JS features. In the old days they did a little bit, but now those features (namely enums) are considered harmful.
More specifically, with the (also ironically gummed up in tc39) type syntax [1], and importantly node introducing the --strip-types option [2], TS is only ever going to look more and more like standards compliant JS.
Records and Tuples weren't stopped because of tc39, but rather the engine developers. Read the notes.
Osiris [3 hidden]5 mins ago
It was also replaced with the Composite proposal, which is similar but not exactly the same.
davidmurdoch [3 hidden]5 mins ago
Aren't the engine devs all part of the TC39 committee? I know they stopped SIMD in JS because they were more interested in shipping WASM, and then adding SIMD to it.
TehShrike [3 hidden]5 mins ago
I was excited for that proposal, but it veered off course some years ago – some TC39 members have stuck to the position that without member property support or async/await support, they will not let the feature move forward.
It seems like most people are just asking for the simple function piping everyone expects from the |> syntax, but that doesn't look likely to happen.
packetlost [3 hidden]5 mins ago
I don't actually see why `|> await foo(bar)` wouldn't be acceptable if you must support futures.
I'm not a JS dev so idk what member property support is.
cogman10 [3 hidden]5 mins ago
Seems like it'd force the rest of the pipeline to be peppered with `await` which might not be desirable
My guess is the TC committee would want this to be more seamless.
This also gets weird because if the `|>` is a special function that sends in a magic `%` parameter, it'd have to be context sensitive to whether or not an `async` thing happens within the bounds. Whether or not it does will determine if the subsequent pipes are dealing with a future of % or just % directly.
packetlost [3 hidden]5 mins ago
It wouldn't though? The first await would... await the value out of the future. You still do the syntactic transformation with the magic parameter. In your example you're awaiting the future returned by getFuture twice and improperly awaiting the output of baz (which isn't async in the example).
(assuming getFuture and bat are both async). You do need |> to be aware of the case where the await keyword is present, but that's about it. The above would effectively transform to:
await bat(baz(await getFuture("bar")));
I don't see the problem with this.
porridgeraisin [3 hidden]5 mins ago
Correct me if I'm wrong, but if you use the below syntax
"bar"
|> await getFuture()
How would you disambiguate it from your intended meaning and the below:
"bar"
|> await getFutureAsyncFactory()
Basically, an async function that returns a function which is intended to be the pipeline processor.
Typically in JS you do this with parens like so:
(await getFutureAsyncFactory())("input")
But the use of parens doesn't transpose to the pipeline setting well IMO
packetlost [3 hidden]5 mins ago
I don't think |> really can support applying the result of one of its composite applications in general, so it's not ambiguous.
Given this example:
(await getFutureAsyncFactory("bar"))("input")
the getFutureAsyncFactory function is async, but the function it returns is not (or it may be and we just don't await it). Basically, using |> like you stated above doesn't do what you want. If you wanted the same semantics, you would have to do something like:
("bar" |> await getFutureAsyncFactory())("input")
to invoke the returned function.
The whole pipeline takes on the value of the last function specified.
porridgeraisin [3 hidden]5 mins ago
Ah sorry I didn't explain properly, I meant
a |> await f()
and
a |> (await f())
Might be expected to do the same thing.
But the latter is syntactically undistinguishable from
a |> await returnsF()
What do you think about
a |> f |> g
Where you don't really call the function with () in the pipeline syntax? I think that would be more natural.
packetlost [3 hidden]5 mins ago
It's still not ambiguous. Your second example would be a syntax error (probably, if I was designing it at least) because you're missing the invocation parenthesis after the wrapped value:
a |> (await f())()
which removes any sort of ambiguity. Your first example calls f() with a as its first argument while the second (after my fix) calls and awaits f() and then invokes that result with a as its first argument.
For the last example, it would look like:
a |> (await f())() |> g()
assuming f() is still async and returns a function. g() must be a function, so the parentheses have to be added.
zdragnar [3 hidden]5 mins ago
I worry about "soon" here. I've been excited for this proposal for years now (8 maybe? I forget), and I'm not sure it'll ever actually get traction at this point.
All of their examples are wordier than just function chaining and I worry they’ve lost the plot somewhere.
They list this as a con of F# (also Elixir) pipes:
value |> x=> x.foo()
The insistence on an arrow function is pure hallucination
value |> x.foo()
Should be perfectly achievable as it is in these other languages. What’s more, doing so removes all of the handwringing about await. And I’m frankly at a loss why you would want to put yield in the middle of one of these chains instead of after.
hoppp [3 hidden]5 mins ago
Cool I love it, but another thing we will need polyfills for...
hathawsh [3 hidden]5 mins ago
I believe you meant to say we will need a transpiler, not polyfill. Of course, a lot of us are already using transpilers, so that's nothing new.
bobbylarrybobby [3 hidden]5 mins ago
How do you polyfill syntax?
jononor [3 hidden]5 mins ago
Letting your JS/TS compiler convert it into supported form. Not really a polyfill, but it allows to use new features in the source and still support older targets. This was done a lot when ES6 was new, I remember.
zdragnar [3 hidden]5 mins ago
Polyfills are for runtime behavior that can't be replicated with a simple syntax transformation, such as adding new functions to built-in objects like string.prototype contains or the Symbol constructor and prototype or custom elements.
I haven't looked at the member properties bits but I suspect the pipeline syntax just needs the transform to be supported in build tools, rather than adding yet another polyfill.
rkangel [3 hidden]5 mins ago
I'm a big fan of the Elixir operator, and it should be standard in all functional programming languages. You need it because everything is just a function and you can't do anything like method chaining, because none of the return values have anything like methods. The |> is "just" syntax sugar for a load of nested functions. Whereas the Rust style method chaining doesn't need language support - it's more of a programming style.
Note also that it works well in Elixir because it was created at the same time as most of the standard library. That means that the standard library takes the relevant argument in the first position all the time. Very rarely do you need to pipe into the second argument (and you need a lambda or convenience function to make that work).
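For instance, a minimal sketch of that data-first convention (illustrative values, nothing from the article):

```
"alice,Bob , carol"
|> String.split(",")              # String.split(string, pattern)
|> Enum.map(&String.trim/1)       # Enum.map(enumerable, fun)
|> Enum.map(&String.capitalize/1)
#=> ["Alice", "Bob", "Carol"]
```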
Even more concise, and it doesn't even require a special language feature; it's just regular syntax of the language (|> is a method like .get(...), so you could even write `params.get("user").|>(create_user)` if you wanted to).
elbasti [3 hidden]5 mins ago
In Elixir, ```Map.get("user") |> create_user |> notify_admin``` would also be valid, standard Elixir, just not idiomatic (parens are optional, but preferred in most cases, and one-line pipes are also frowned upon except for scripting).
MaxBarraclough [3 hidden]5 mins ago
With the disclaimer that I don't know Elixir and haven't programmed with the pipeline operator before: I don't like that special () syntax. That syntax denotes application of the function without passing any arguments, but the whole point here is that an argument is being passed. It seems clearer to me to just put the pipeline operator and the name of the function that it's being used with. I don't see how it's unclear that application is being handled by the pipeline operator.
Also, what if the function you want to use is returned by some nullary function? You couldn't just do |> getfunc(), as presumably the pipeline operator will interfere with the usual meaning of the parentheses and will try to pass something to getfunc. Would |> ( getfunc() ) work? This is the kind of problem that can arise when one language feature is permitted to change the ordinary behaviour of an existing feature in the name of convenience. (Unless of course I'm just missing something.)
freehorse [3 hidden]5 mins ago
I am also confused with such syntax of "passing as first argument" pipes. Having to write `x |> foo` instead of `x |> foo()` does not solve much, because you have the same lack of clarity if you need to pass a second argument. Ie `x |> foo(y)` in this case means `foo(x,y)`, but if `foo(y)` actually gives you a function to apply to `x` prob you should write `x |> foo(y)()` or `x |> (foo(y))()` then as I understand it? If that even makes sense in a language. In any case, you have the same issue as before, in different contexts `foo(y)` is interpreted differently.
I just find this syntax too inconsistent and vague, and hence actually annoying. Which is why I prefer defining pipes as composition of functions which can then be applied to whatever data. Then eg one can write sth like `(|> foo1 foo2 (foo3) #(foo4 % y))` and know that foo1 and foo2 are references to functions, foo3 evaluates to another function, and when one needs more arguments in foo4 they have to explicitly state that. This gives another function, and there is no ambiguity here whatsoever.
valenterry [3 hidden]5 mins ago
> Having to write `x |> foo` instead of `x |> foo()` does not solve much, because you have the same lack of clarity if you need to pass a second argument
That's actually true. In Scala that is not so nice, because then it becomes `x |> foo(_, arg2)` or, even worse, `x |> (param => foo(param, arg2))`. I have a few such cases in my sourcecode and I really don't like it. Haskell and PureScript do a much better job keeping the code clean in such cases.
valenterry [3 hidden]5 mins ago
> It seems clearer to me to just put the pipeline operator and the name of the function that it's being used with.
I agree with that and it confused me that it looks like the function is not referenced but actually applied/executed.
valenterry [3 hidden]5 mins ago
Oh that's nice!
agent281 [3 hidden]5 mins ago
Isn't it being a method call not quite equivalent? Are you able to define the method over arbitrary data types?
In Elixir, it is just a macro so it applies to all functions. I'm only a Scala novice so I'm not sure how it would work there.
valenterry [3 hidden]5 mins ago
> Are you able to define the method over arbitrary data types?
Yes exactly, which is why it is not equivalent. No macro needed here. In Scala 2 syntax:
```
implicit class AnyOps[A](private val a: A) extends AnyVal {
def |>[B](f: A => B) = f(a)
}
```
mvieira38 [3 hidden]5 mins ago
R has a lovely toolkit for data science using this syntax, called the tidyverse. My favorite dev experience, it's so easy to just write code
AdieuToLogic [3 hidden]5 mins ago
> I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.
This is usually the Thrush combinator[0], exists in other languages as well, and can be informally defined as:
Not quite. Note that the Elixir pipe puts the left hand of the pipe as the first argument in the right-hand function. E.g.
x |> f(y) = f(x, y)
As a result, the Elixir variant cannot be defined as a well-typed function, but must be a macro.
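Concretely, a sketch of the rewrite (create_user and opts are just illustrative names):

```
# what you write:
params |> Map.get("user") |> create_user(opts)

# roughly what the compiler sees after macro expansion:
create_user(Map.get(params, "user"), opts)
```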
matthewsinclair [3 hidden]5 mins ago
Agree. This is absolutely my fave part of Elixir. Whenever I can get something to flow elegantly thru a pipeline like that, I feel like it’s a win against chaos.
Alupis [3 hidden]5 mins ago
Pipelines are one of the greatest Gleam features[1].
I wouldn't say it's a Gleam feature per se, in that it's not something Gleam added that isn't already in Elixir.
jasperry [3 hidden]5 mins ago
Yes, a small feature set is important, and adding the functional-style pipe to languages that already have chaining with the dot seems to clutter up the design space. However, dot-chaining has the severe limitation that you can only pass to the first or "this" argument.
Is there any language with a single feature that gives the best of both worlds?
AndyKluger [3 hidden]5 mins ago
Do concatenative langs like Factor fit the bill?
bnchrch [3 hidden]5 mins ago
FWIW you can pass to other arguments than first in this syntax
```
params
|> Map.get("user")
|> create_user()
|> (&notify_admin("signup", &1)).()
```
or
```
params
|> Map.get("user")
|> create_user()
|> (fn user -> notify_admin("signup", user) end).()
```
Terr_ [3 hidden]5 mins ago
BTW, there's a convenience macro of Kernel.then/2 [0] which IMO looks a little cleaner:
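Presumably something like this (a sketch reusing the notify_admin example from upthread; then/2 simply calls the given function with the piped value):

```
params
|> Map.get("user")
|> create_user()
|> then(&notify_admin("signup", &1))
```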
The pipe operator relies on the first argument being the subject of the operation. A lot of languages have the arguments in a different order, and OO languages sometimes use function chaining to get a similar result.
Terr_ [3 hidden]5 mins ago
IIRC the usual workaround in Elixir involves a small lambda that rearranges things:
"World"
|> then(&concat("Hello ", &1))
I imagine a shorter syntax could someday be possible, where some special placeholder expression could be used, ex:
"World"
|> concat("Hello ", &1)
However that creates a new problem: If the implicit-first-argument form is still permitted (foo() instead of foo(&1)) then it becomes confusing which function-arity is being called. A human could easily fail to notice the absence or presence of the special placeholder on some lines, and invoke the wrong thing.
freehorse [3 hidden]5 mins ago
Yeah, R (tidyverse) has `.` as such a placeholder. It is useful but indeed I find the syntax off, though I find the syntax off even without it, anyway. I would rather define pipes as compositions of functions, which are pretty unambiguous in terms of what arguments they get, and then apply these to whatever i want.
agent281 [3 hidden]5 mins ago
Last time I checked (2020) there were already a few rejected proposals to shorten the syntax for this. It seemed like they were pretty exasperated by them at the time.
hinkley [3 hidden]5 mins ago
Yeah I really hate that syntax and I can’t even explain why so I kind of blot it out, but you’re right.
My dislike does improve my test coverage though, since I tend to pop out a real method instead.
sparkie [3 hidden]5 mins ago
You could make use of `flip` from Haskell.
flip :: (x -> y -> z) -> (y -> x -> z)
flip f = \y -> \x -> f x y
x |> (flip f)(y) -- f(x, y)
I feel like Haskell really missed a trick by having $ not go the other way, though it's trivial to make your own symbol that goes the other way.
jose_zap [3 hidden]5 mins ago
Haskell has & which goes the other way:
users
& map validate
& catMaybes
& mapM persist
taolson [3 hidden]5 mins ago
Yes, `&` (reverse apply) is equivalent to `|>`, but it is interesting that there is no common operator for reversed compose `.`, so function compositions are still read right-to-left.
In my programming language, I added `.>` as a reverse-compose operator, so pipelines of function compositions can also be read uniformly left-to-right, e.g.
process = map validate .> catMaybes .> mapM persist
1-more [3 hidden]5 mins ago
Elm (written in Haskell) uses |> and <| for pipelining forwards and backwards, and function composition is >> and <<. These have made it into Haskell via nri-prelude https://hackage.haskell.org/package/nri-prelude (written by a company that uses a lot of Elm in order to make writing Haskell look more like writing Elm).
EDIT: in no way do I want to claim the originality of these things in Elm or the Haskell package inspired by it. AFAIK |> came from F# but it could be miles earlier.
shadytrees [3 hidden]5 mins ago
Maybe not common, but there’s Control.Arrow.(>>>)
lgas [3 hidden]5 mins ago
Also you can (|>) = (&) (with an appropriate fixity declaration) to get
I guess I'm showing how long it's been since I was a student of Haskell then. Glad to see the addition!
layer8 [3 hidden]5 mins ago
It would be even better without the `>`, though. The `|>` is a bit awkward to type, and more noisy visually.
MyOutfitIsVague [3 hidden]5 mins ago
I disagree, because then it can be very ambiguous with an existing `|` operator. The language has to be able to tell that this is a pipeline and not doing a bitwise or operation on the output of multiple functions.
layer8 [3 hidden]5 mins ago
Yes, I’m talking about a language where `|` would be the pipe operator and nothing else, like in a shell. Retrofitting a new operator into an existing language tends to be suboptimal.
neonsunset [3 hidden]5 mins ago
Elixir itself adopted this operator from F#
Straw [3 hidden]5 mins ago
Lisp macros allow a general solution to this that doesn't just handle chained collection operators but allows you to decide the order in which you write any chain of calls.
For example, we can write:
(foo (bar (baz x))) as
(-> x baz bar foo)
If there are additional arguments, we can accommodate those too:
(sin (* x pi)) as
(-> x (* pi) sin)
Where expression so far gets inserted as the first argument to any form. If you want it inserted as the last argument, you can use ->> instead:
(filter positive? (map sin x)) as
(->> x (map sin) (filter positive?))
You can also get full control of where to place the previous expression using as->.
Yeah, I found this when I was playing around with Hy a while back. I wanted a generic `->` style operator, and it wasn't too much trouble to write a macro to introduce one.
That's sort of an argument for the existence of macros as a whole, you can't really do this as neatly in something like python (although I've tried) - I can see the downside of working in a codebase with hundreds of these kind of custom language features though.
gleenn [3 hidden]5 mins ago
I find the threading operators in Clojure bring much joy and increase readability. I think it's interesting because it makes me actually consider function argument order much more because I want to increase opportunities to use them.
aeonik [3 hidden]5 mins ago
These threading macros can increase performance; the developer even has a parallelizing threading macro available.
Yes threading macros are so much nicer than method chaining, because it allows general function reuse, rather than being limited to the methods that happen to be defined in your initial data object.
davemp [3 hidden]5 mins ago
Computer scientists continue to pick terrible names. Pipelining is already an overloaded concept that implies some type of operation level parallelism. Picking names like this does everyone in the field a disservice. Calling it something like “composition chain” would be much clearer with respect to existing literature in the field. Maybe I’m being nitpicky, but sometimes it feels like the tower of babel parable talking to folks who use different ecosystems.
duped [3 hidden]5 mins ago
A pipeline operator is just partial application with less power. You should be able to bind any number of arguments to any places in order to create a new function and "pipe" its output(s) to any other number of functions.
One day, we'll (re)discover that partial application is actually incredibly useful for writing programs and (non-Haskell) languages will start with it as the primitive for composing programs instead of finding out that it would be nice later, and bolting on a restricted subset of the feature.
gpderetta [3 hidden]5 mins ago
for loops are also gotos with less power, yet we usually prefer them.
zelphirkalt [3 hidden]5 mins ago
I like partial application like in Standard ML, but it also means that one must be very careful with the order of arguments, unless we get a variant of partial application that is flexible enough to let you specify which arguments you want to provide, instead of always assuming the first n arguments. I use "cut" for this in Scheme. Threading/pipelines are still very useful though and can shorten things and make them very readable.
dayvigo [3 hidden]5 mins ago
Sure. But how do you write that in a way that is expressive, terse, and readable all at once? Nothing beats x | y | z or (-> x y z). The speed of both writing and reading (and comprehending), the sheer simplicity, is what makes pipelining useful in the first place.
choult [3 hidden]5 mins ago
... and then recreate the scripting language...
stogot [3 hidden]5 mins ago
I was just thinking does this not sound like a shell language? Using | instead of .function()
R + tidyverse is the gold standard for working with data quickly in a readable and maintainable way, IMO. It's just absolutely seamless. Shoutout to tidyverts (https://tidyverts.org/) for working with time series, too
thom [3 hidden]5 mins ago
I've always found magrittr mildly hilarious. R has vestigial Lisp DNA, but somehow the R implementation of pipes was incredibly long, complex and produced stack traces, so it moved to a native C implementation, which nevertheless has to manipulate the SEXPs that secretly underlie the language. Compared to something like Clojure's threading macros it's wild how much work is needed.
Base R as well: |> was implemented as a pipe operator in 4.1.0.
tylermw [3 hidden]5 mins ago
Importantly, the base R pipe implements the operation at the language parsing level, so it has basically zero overhead.
zelphirkalt [3 hidden]5 mins ago
I would assume that most languages do that, or alternatively have a compiler that is smart enough to ensure there is no actual overhead in the compiled code.
steine65 [3 hidden]5 mins ago
R, specifically tidyverse, has a special place in my heart. Tidy principles makes data analysis easy to read and easy to use new functions, since there are standards that must be met to call a function "tidy."
Recently I started using Nushell, which feels very similar.
amai [3 hidden]5 mins ago
Pipelining looks nice until you have to debug it. And exception handling is also very difficult, because it means adding forks into your pipelines. Pipelines are only good for programming the happy path.
mpalmer [3 hidden]5 mins ago
At the risk of overgeneralized pronouncements: ease of debugging is usually down to how well-designed your tooling happens to be. Most of the time the framework/language does that for you, but it's not the only option.
And for exceptions, why not solve it in the data model, and reify failures? Push it further downstream, let your pipeline's nodes handle "monadic" result values.
Point being, it's always a tradeoff, but you can usually lessen the pain more than you think.
And that's without mentioning that a lot of "pipelining" is pure sugar over the same code we're already writing.
jim-jim-jim [3 hidden]5 mins ago
I don't know what you're writing, but this sounds like language smell. If you can represent errors as data instead of exceptions (Either, Result, etc) then it is easy to see what went wrong, and offer fallback states in response to errors.
Programming should be focused on the happy path. Much of the syntax in primitive languages concerning exceptions and other early returns is pure noise.
eikenberry [3 hidden]5 mins ago
Pipelining simplifies debugging. Each step is obvious and it is trivial to insert logging between pipeline elements. It is easier to debug than the patterns compared in the article.
Exception handing is only a problem in languages that use exceptions. Fortunately there are many modern alternatives in wide use that don't use exceptions.
switchbak [3 hidden]5 mins ago
This is my experience too - when the errors are encoded into the type system, this becomes easier to reason about (which is much of the work when you’re debugging).
w4rh4wk5 [3 hidden]5 mins ago
Yes, certainly!
I've encountered and used this pattern in Python, Ruby, Haskell, Rust, C#, and maybe some other languages. It often feels nice to write, but reading can easily become difficult -- especially in Haskell where obscure operators can contain a lot of magic.
Debugging them interactively can be equally problematic, depending on the tooling. I'd argue, it's commonly harder to debug a pipeline than the equivalent imperative code and, that in the best case it's equally hard.
hnlmorg [3 hidden]5 mins ago
Pipelining is just syntactic sugar for nested function calls.
If you need to handle an unhappy path in a way that isn’t optimal for nested function calls then you shouldn’t be nesting your function calls. Pipelining doesn’t magically make things easier nor harder in that regard.
But if a particular sequence of function calls do suit nesting, then pipelining makes the code much more readable because you’re not mixing right-to-left syntax (function nests) with left-to-right syntax (ie you’re typical language syntax).
EVa5I7bHFq9mnYK [3 hidden]5 mins ago
I think they are talking about nested loops, not nested function calls.
hnlmorg [3 hidden]5 mins ago
Nested loops isn’t pipelining. Some of the examples make heavy use of lambda so they do have nested loops happening as well but in those examples the pipelining logic is still the nesting of the lambda functions.
Crudely put, in C-like languages, pipelining is just as way of turning
fn(fn(fn()))
Where the first function call is in the inner, right-most, parentheses,
into this:
fn | fn | fn
…which can be easily read sequentially from left-to-right.
EVa5I7bHFq9mnYK [3 hidden]5 mins ago
Pipelining replaces several consecutive loops with a single loop, doing more complex processing.
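To make that concrete, a small Elixir sketch (not the commenter's code): eager Enum calls build a full intermediate list per step, while Stream fuses the steps so the data is traversed once:

```
1..1_000_000
|> Stream.map(&(&1 * 2))      # lazy, nothing computed yet
|> Stream.filter(&(&1 > 10))  # still lazy, fused with the map
|> Enum.take(5)               # a single pass over just enough elements
#=> [12, 14, 16, 18, 20]
```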
bergen [3 hidden]5 mins ago
Depends on the context. In a scripting language where you have some kind of console, you can just copy part of the pipeline at a time and see what each step does, one after another. This is pretty straightforward.
(Not talking about compiled code though)
rusk [3 hidden]5 mins ago
Established debugging tools and logging rubric are not suitable for debugging heavily pipelined code. Stack traces, debuggers rely heavily on line based references which are less useful in this style and can make diagnostic practices feel a little clumsy.
The old adage of not writing code so smart you can’t debug it applies here.
Pipelining runs contrary enough to standard imperative patterns. You don’t just need a new mindset to write code this way. You need to think differently about how you structure your code overall and you need different tools.
That’s not to say that doing things a different way isn’t great, but it does come with baggage that you need to be in a position to carry.
bsder [3 hidden]5 mins ago
Pipelining is also nice until you have to use it for everything because you can't do alternatives (like default function arguments) properly.
Rust chains everything because of this. It's often unpleasant (see: all the Rust GUI toolkits).
rocqua [3 hidden]5 mins ago
The right-to-left order of function application really doesn't work well with English reading left to right.
I found this especially clear with the 'composition operator' for functions, where f.g has to mean f _after_ g, because you really want:
f.g = f(g(x))
Based on this, I think a reverse polish type of notation would be a lot better. Though perhaps it is a lot nicer to think of "the sine of an angle" than "angle sine-ed".
Not that it matters much, the switching costs are immense. Getting people able to teach it would be impossible, and collaboration with people taught in the other system would be horrible. I am doubtful I could make the switch, even if I wanted.
bjourne [3 hidden]5 mins ago
In concatenative languages with an implicit stack (Factor) that expression would read:
iter [ alive? ] filter [ id>> ] map collect
The beauty of this is that everything can be evaluated strictly left-to-right. Every single symbol. "Pipelines" in other languages are never fully left-to-right evaluated. For example, ".filter(|w| w.alive)" in the author's example requires one to switch from postfix to infix evaluation to evaluate the filter application.
The major advantage is that handling multiple streams is natural. Suppose you want to compute the dot product of two files where each line contains a float:
While the author claims "semantics beat syntax every day of the week," the entire article focuses on syntax preferences rather than semantic differences.
Pipelining can become hard to debug when chains get very long. The author doesn't address how hard it can be to identify which step in a long chain caused an error.
They do make fun of Python, however. But don't say much about why they don't like it other than showing a low-res photo of a rock with a pipe routed around it.
Ambiguity about what constitutes "pipelining" is the real issue here. The definition keeps shifting throughout the article. Is it method chaining? Operator overloading? First-class functions? The author uses examples that function very differently.
Mond_ [3 hidden]5 mins ago
> Pipelining can become hard to debug when chains get very long. The author doesn't address how hard it can be to identify which step in a long chain caused an error.
Yeah, I agree that this can be problem when you lean heavily into monadic handling (i.e. you have fallible operations and then pipe the error or null all the way through, losing the information of where it came from).
But that doesn't have much to do with the article: You have the same problem with non-pipelined functional code. (And in either case, I think that it's not that big of a problem in practice.)
> The author uses examples that function very differently.
Yeah, this is addressed in one of the later sections. Imo, having a unified word for such a convenience feature (no matter how it's implemented) is better than thinking of these features as completely separate.
zelphirkalt [3 hidden]5 mins ago
You can add peek steps in pipelines and inspect the in between results. Not really any different from normal function call debugging imo.
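For example, in Elixir (a sketch assuming `lines` is a list of strings; most languages have a tap/peek equivalent), IO.inspect returns its argument unchanged, so it can be dropped between any two steps:

```
lines
|> Enum.map(&String.trim/1)
|> IO.inspect(label: "after trim")
|> Enum.filter(&String.contains?(&1, "needle"))
|> IO.inspect(label: "after filter")
|> Enum.map(&String.replace(&1, "foo", "bar"))
```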
krapht [3 hidden]5 mins ago
Yes, but here's my hot take - what if you didn't have to edit the source code to debug it? Instead of chaining method calls you just assign to a temporary variable. Then you can set breakpoints and inspect variable values like you do normally without editing source.
It's not like you lose that much readability from
foo(bar(baz(c)))
c |> baz |> bar |> foo
c.baz().bar().foo()
t = c.baz()
t = t.bar()
t = t.foo()
Mond_ [3 hidden]5 mins ago
I feel like a sufficiently good debugger should allow you to place a breakpoint at any of the lines here, and it should break exactly at that specific line.
It sounds to me like you're asking for linebreaks. Chaining doesn't seem to be the issue here.
krapht [3 hidden]5 mins ago
I'm only familiar with C++, Python, and SQL. Neither GDB nor PDB helps here, and I've never heard of a SQL debugger that will break apart expressions and let you view intermediate query results.
Mond_ [3 hidden]5 mins ago
That'd be problematic, but also sounds like a (solvable) tooling problem to me.
cess11 [3 hidden]5 mins ago
You can use EXPLAIN and similar keywords to see the execution plans in common SQL database engines. In practice you don't really care about the actual intermediate data so it doesn't show it, usually it's enough to learn whether indices are used at every step.
But you could in many cases easily infer from the execution plan what a query would look like and fetch an intermediate set separately.
jen20 [3 hidden]5 mins ago
It’s been a while since I’ve used one, but I’m fairly sure the common debuggers for C#, F#, Rust and Java would all behave correctly when breakpointed like this.
Merad [3 hidden]5 mins ago
JetBrains Rider does this for C# code (I think Visual Studio does as well). Its inlay hints feature will also show you hints with the result type of each line as the data is transformed. I haven't explicitly tested, but I would imagine their IDEs for other languages behave the same.
andyferris [3 hidden]5 mins ago
A debugger should let you inspect the value of any expression, not just variables.
erichocean [3 hidden]5 mins ago
The Clojure equivalent of `c |> baz |> bar |> foo` are the threading macros:
(-> c baz bar foo)
But people usually put it on separate lines:
(-> c
baz
bar
foo)
joeevans1000 [3 hidden]5 mins ago
And with the Emacs Enlighten feature the second version enables seeing the results of each step right in the editor, to the right of the step.
erichocean [3 hidden]5 mins ago
You can achieve something similar in Clojure with the Flowstorm debugger[0] (it's free).
the paragraph you quoted (atm, 7 mins ago, did it change?) says:
>Let me make it very clear: This is [not an] article, it's a hot take about syntax. In practice, semantics beat syntax every day of the week. In other words, don’t take it too seriously.
AYBABTME [3 hidden]5 mins ago
It's just as difficult to debug when function calls are nested inline instead of assigning to variables and passing the variables around.
steine65 [3 hidden]5 mins ago
Agreed that long chains are hard to debug. I like to keep chains around the size of a short paragraph.
bena [3 hidden]5 mins ago
I think you may have misinterpreted his motive here.
Just before that statement, he says that it is an article/hot take about syntax. He acknowledges your point.
So I think when he says "semantics beat syntax every day of the week", that's him acknowledging that while he prefers certain syntax, it may not be the best for a given situation.
pavel_lishin [3 hidden]5 mins ago
The article also clearly points out that it's just a hot take, and says not to take it too seriously.
epolanski [3 hidden]5 mins ago
I personally like how effect-ts allows you to write both pipelines or imperative code to express the very same things.
Having both options is great (at the beginning effect had only pipe-based pipelines), after years of writing effect I'm convinced that most of the time you'd rather write and read imperative code than pipelines which definitely have their place in code bases.
In fact most of the community, at large, converged on using imperative-style generators over pipelines, and having onboarded many devs and seen many long-time pipeliners converge to classical imperative control flow seems to confirm that both debugging and maintenance are easier.
andyferris [3 hidden]5 mins ago
Pipelining is great! Though sometimes you want to put the value in the first argument of a function, or a different location, or else call a method... it can be nice to simply refer to the value directly with `_` or `%` or `$` or something.
In fact, I always thought it would be a good idea for all statement blocks (in any given programming language) to allow an implicit reference to the value of the previous statement. The pipeline operation would essentially be the existing semicolons (in a C-like language) and there would be a new symbol or keyword used to represent the previous value.
For example, the MATLAB REPL allows for referring to the previous value as `ans` and the Julia REPL has inherited the same functionality. You can copy-paste this into the Julia REPL today:
[1, 2, 3];
map(x -> x * 2, ans);
@show ans;
filter(x -> x > 2, ans);
@show ans;
sum(ans)
You can't use this in Julia outside the REPL, and I don't think `ans` is a particularly good keyword for this, but I honestly think the concept is good enough. The same thing in JavaScript using `$` as an example:
I feel it would work best with expression-based languages having blocks that return their final value (like Rust) since you can do all sorts of nesting and so-on.
cess11 [3 hidden]5 mins ago
In the Node and Python interpreters you'd use _, in browser JS consoles (and PHP shells like Psysh/Tinker) you'd use $_, in Picolisp @ (or @@ or @@@ for previous computations).
I think most interactive programming shells have an equivalent.
vitus [3 hidden]5 mins ago
I think the biggest win for pipelining in SQL is the fact that we no longer have to explain that SQL execution order has nothing to do with query order, and we no longer have to pretend that we're mimicking natural language. (That last point stops being the case when you go beyond "SELECT foo FROM table WHERE bar LIMIT 10".)
No longer do we have to explain that expressions are evaluated in the order of FROM -> JOIN -> ON -> SELECT -> WHERE -> GROUP BY -> HAVING -> ORDER BY -> LIMIT (and yes, I know I'm missing several other steps). We can simply just express how our data flows from one statement to the next.
(I'm also stating this as someone who has yet to play around with the pipelining syntax, but honestly anything is better than the status quo.)
_dark_matter_ [3 hidden]5 mins ago
You flipped SELECT and WHERE, which probably just solidifies your point. I can't count the number of times I've seen this trip up analysts.
osigurdson [3 hidden]5 mins ago
C# has had "Pipelining" (aka Linq) for 17 years. I do miss this kind of stuff in Go a little.
bob1029 [3 hidden]5 mins ago
I don't see how LINQ provides an especially illuminating example of what is effectively method chaining.
It is an exemplar of expressions [0] more than anything else, which have little to do with the idea of passing results from one method to another.
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
data.iter() // get iterator over elements of the list
.filter(|w| w.alive) // use lambda to ignore tombstoned widgets
.map(|w| w.id) // extract ids from widgets
.collect() // assemble iterator into data structure (Vec)
}
Same thing in 15 year old C# code.
List<Guid> GetIds(List<Widget> data)
{
return data
.Where(w => w.IsAlive())
.Select(w => w.Id)
.ToList();
}
hahn-kev [3 hidden]5 mins ago
So many things have been called Linq over the years it's hard to talk about at this point. I've written C# for many years now and I'm not even sure what I would say it's referring to, so I avoid the term.
In this case I would say extension methods are what he's really referring to, which LINQ to Objects is built on top of.
osigurdson [3 hidden]5 mins ago
I'd say there are just two things:
1) The method chaining extension methods on IEnumerable<T> like Select, Where, GroupBy, etc. This is identical to the rust example in the article.
2) The weird / bad (in my opinion) language keywords analogous to the above such as "from", "where", "select" etc.
delusional [3 hidden]5 mins ago
You might be talking about LINQ queries, while the person you are responding to is probably talking about LINQ in Method Syntax[1]
I've used "a series of CTEs" to apply a series of transformations and filters, but it's not nearly as elegant as the pipe syntax.
singularity2001 [3 hidden]5 mins ago
I tried to convince the Julia authors to make a.b(c) synonymous with b(a, c), like in Nim (for similar reasons as in the article). They didn't like it.
sparkie [3 hidden]5 mins ago
I don't like it either, because it promotes method `b` to the global namespace. There may be many such `b` methods on different, unrelated types. I think that the latter should be prefixed with the typename or module name.
a.b(c) == AType.b(a, c) (or AType::b(a, c) , C++ style)
singularity2001 [3 hidden]5 mins ago
It's the other way around: in Julia b are functions which are globally visible by default and I just suggested to optionally hide them or find them via the object a.
queuebert [3 hidden]5 mins ago
What were their reasons?
pansa2 [3 hidden]5 mins ago
I suspect:
Julia's multiple dispatch means that all arguments to a function are treated equally. The syntax `b(a, c)` makes this clear, whereas `a.b(c)` makes it look like `a` is in some way special.
0xf00ff00f [3 hidden]5 mins ago
First example doesn't look bad in C++23:
auto get_ids(std::span<const Widget> data)
{
return data
| filter(&Widget::alive)
| transform(&Widget::id)
| to<std::vector>();
}
uzerfcwn [3 hidden]5 mins ago
To me, the cool (and uncommon in other languages' standard libraries) part about C++ ranges is that they reify pipelines so that you can cut and paste them into variables, like so:
auto get_ids(std::span<const Widget> data)
{
auto pipeline = filter(&Widget::alive) | transform(&Widget::id);
auto sink = to<std::vector>();
return data | pipeline | sink;
}
Shorel [3 hidden]5 mins ago
This looks awesome!
I really want to start playing with some C++23 in the future.
Came here for the Uniform function call syntax link. This is one of the little choices that has a big impact on a language! I love it!
I wrote a little pipeline macro in https://nim-lang.org/ for Advent of Code years ago and as far as I know it worked okay.
```
import macros

macro `|>`*(left, right: expr): expr =
  result = newNimNode(nnkCall)
  case right.kind
  of nnkCall:
    result.add(right[0])
    result.add(left)
    for i in 1..<right.len:
      result.add(right[i])
  else:
    error("Unsupported node type")
```
Makes me want to go write more nim.
vips7L [3 hidden]5 mins ago
I really wish you couldn't write extensions on nullable types. It's confusing to be able to call what look like instance functions on something clearly nullable without checking.
fun main() {
val s: String? = null
println(s.isS()) // false
}
fun String?.isS() = "s" == this
usrusr [3 hidden]5 mins ago
The difference between .let{} and ?.let{} has great utility. You'd either have to give that up or promote let from regular code in the standard library to magic language feature.
And you'd lose all those cases of extension methods where the convenience of accepting null left of the dot is their sole reason to be. Null is a valid state, not something incredibly scary best dealt with with a full reboot or better yet throwing away the container. Kotlin is about making peace with null, instead of pretending that null does not exist. (yes, I'm looking at you, Scala)
What I do agree with is that extension methods should be a last ditch solution. I'd actually like to see a way to do the nullable receiver thing defined more like regular functions. Perhaps something like
fun? name() = if (this==null) "(Absent)" else this.name
that is defined inside the regular class block, imported like a regular method (as part of the class) and even present in the class object e.g. for reflection on the non-null case (and for Java compat where that still matters)
vips7L [3 hidden]5 mins ago
Personally I find ?.let et al to be terrible for readability; most of the time you're better off doing a standard != null check. The same with nullable extension functions; they hurt readability.
> Null is a valid state, not something incredibly scary best dealt with with a full reboot or better yet throwing away the container. Kotlin is about making peace with null, instead of pretending that null does not exist. (yes, I'm looking at you, Scala)
I honestly find this to be such a weird thing to say or imply. No one is "scared" of null.
mrkeen [3 hidden]5 mins ago
> Null is a valid state, not something incredibly scary best dealt with with a full reboot or better yet throwing away the container.
Agreed. It should be a first-class construct in a language with its own proper type,
Null null;
rather than needing to hitch a ride with the Integers and Strings like a second-class construct.
pornel [3 hidden]5 mins ago
Rust has such open extensibility through traits. The prime example is Itertools that already adds a bunch of extra pipelining helper methods.
Mond_ [3 hidden]5 mins ago
> The second approach is open for extension - it allows you to write new functions on old datatypes.
I prefer to just generalize the function (make it generic, leverage traits/typeclasses) tbh.
> Probably for lack of > weird operators like <$>, <*>, $, or >>=
Nope btw. I mean, maybe? I don't know Haskell well enough to say. The answer that I was looking for here is a specific Rust idiosyncrasy. It doesn't allow you to import `std::iter::Iterator::collect` on its own. It's an associated function, and needs to be qualified. (So you need to write `Iterator::collect` at the very least.)
higherhalf [3 hidden]5 mins ago
> It doesn't allow you to import `std::iter::Iterator::collect` on its own. It's an associated function, and needs to be qualified.
Oh, interesting! Thank you, I did not know about that, actually.
weinzierl [3 hidden]5 mins ago
I suffer from (what I call) bracket claustrophobia. Whenever brackets get nested too deep, it makes me uncomfortable. But I fully realize that there are people who are the complete opposite. Lisp programmers are apparently as claustrophilic as cats and spelunkers.
monsieurbanana [3 hidden]5 mins ago
Forget the parenthesis, embrace the automatic indentation and code source manipulations that only perfectly balanced homoiconic expressions can give you.
kissgyorgy [3 hidden]5 mins ago
In Nix, you can do something like this:
gitRef = with lib;
  pipe ./.git/HEAD [
    readFile
    trim
    (splitString ":")
    last
    trim
    (ref: ./.git/${ref})
    readFile
    trim
  ];
Super clean and cool!
RHSeeger [3 hidden]5 mins ago
I feel like, at least in some cases, the article is going out of its way to make the "undesired" look worse than it needs to be. Comparing
Admittedly, the chaining is still better. But a fair number of the article's complaints are about the lack of newlines being used; not about chaining itself.
the_sleaze_ [3 hidden]5 mins ago
In my eyes newlines don't solve what I feel to be the issue. The reader needs to switch from reading left-to-right to reading right-to-left.
Of course this really only matters when you're 25 minutes into critical downtime and a bug is hiding somewhere in these method chains. Anything that is surprising needs to go.
IMHO it would be better to set intermediate variables with dead simple names instead of newlines.
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
let iter = iter(data);
let wingdings = map(iter, |w| w.toWingding());
let alive_wingdings = filter(wingdings, |w| w.alive);
let ids = map(alive_wingdings, |w| w.id);
let collected = collect(ids);
collected
}
trealira [3 hidden]5 mins ago
> Reader needs to recognize reading from left->right to right->left.
Yeah, I agree. The problem is that you have to keep track of nesting in the middle of the expression and then unnest it at the end, which is taxing.
So, I also think it could also read better written like this, with the arguments reversed, so you don't have to read it both ways:
That's also what they do in Haskell. The first argument to map is the mapping function, the first argument to filter is the predicate function, and so on. People will often just write the equivalent of:
as their function definitions, with the argument omitted because using the function composition operator looks neater than using a bunch of dollar signs or parentheses.
Making it the second argument only makes sense when functions are written after their first argument, not before, to facilitate writing "foo.map(f).filter(y)".
matt_kantor [3 hidden]5 mins ago
I've been prototyping a programming language[0] with Haskell-like function conventions (all functions are unary and the "primary" parameter comes last). I recently added syntax to allow applying any "binary" function using infix notation, with `a f b` being the same as `f(b)(a)`[1]. Argument order is swapped compared to Haskell's infix notation (where `a f b` would desugar to `f(a)(b)`).
Along with the `|>` operator (which is itself just a function that's conventionally infixed), this turns out to be really nice for flexibility/reusability. All of these programs do the same thing:
[1]: In reality variable dereferencing uses a sigil, but I'm omitting it from this comment to keep the examples focused.
RHSeeger [3 hidden]5 mins ago
The argument ordering Haskell (and, I think, most functional languages) uses is definitely simpler to read. It keeps the components of the transformation/filter together.
tasuki [3 hidden]5 mins ago
Oh wow, are we living in the same universe? To me the one-line example and your example with line breaks... they just... look about the same?
See how adding line breaks still keeps the `|w| w.alive` very far from the `filter` call? And the `|w| w.id` very far from the `map` call?
If you don't have the pipeline operator, please at least format it something like this:
It's not about line breaks, it's about the order of applying the operations, and about the parameters to the operations you're performing.
RHSeeger [3 hidden]5 mins ago
> It's not about line breaks, it's about the order of applying the operations
For me, it's both. Honestly, I find it much less readable the way you've split it up. The way I had it makes it very easy for me to read it in reverse; map, filter, map, collect
> Also see how this still reads fine despite being one line
It doesn't read fine, to me. I have to spend mental effort figuring out what the various "steps" are. Effort that I don't need to spend when they're split across lines.
For me, it's a "forest for the trees" kind of thing. I like being able to look at the code casually and see what it's doing at a high level. Then, if I want to see the details, I can look more closely at the code.
TOGoS [3 hidden]5 mins ago
They did touch on that.
> You might think that this issue is just about trying to cram everything onto a single line, but frankly, trying to move away from that doesn’t help much. It will still mess up your git diffs and the blame layer.
Diff will still be terrible because adding a step will change the indentation of everything 'before it' (which, somewhat confusingly, is below it syntactically) in the chain.
RHSeeger [3 hidden]5 mins ago
Diff can ignore whitespace, so not really an issue. Not _as_ nice, but not really a problem.
neuroelectron [3 hidden]5 mins ago
I really like the website layout. I'm guessing that they're optimizing for Kindle or other e-paper readers.
pixelmeister [3 hidden]5 mins ago
I recognized this site layout from a past HN post about a solar powered website. Check out their about page. It links to the source for the style that explains why it looks the way it does. Not to spoil it, but it's not for e-readers :)
immibis [3 hidden]5 mins ago
We had this - it was called variables. You could do:
x = iter(data);
y = filter(x, w=>w.isAlive);
z = map(y, w=>w.id);
return collect(z);
It doesn't need new syntax, but to implement this with the existing syntax you do have to figure out what the intermediate objects are, but you also have that problem with "pipelining" unless it compiles the whole chain into a single thing a la Linq.
huyegn [3 hidden]5 mins ago
I liked the pipelining syntax so much from pyspark and linq that I ended up implementing my own mini linq-like library for python to use in local development. It's mainly used in quick data processing scripts that I run locally. The syntax just makes everything much nicer to work with.
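Roughly, the core of such a wrapper can look like this in Python (the names here are illustrative, not the actual library):

```python
class Seq:
    """Minimal LINQ-ish wrapper: every method returns a new Seq, so calls chain."""
    def __init__(self, items):
        self._items = list(items)

    def where(self, pred):
        return Seq(x for x in self._items if pred(x))

    def select(self, fn):
        return Seq(fn(x) for x in self._items)

    def to_list(self):
        return list(self._items)

# hypothetical usage with objects that have .alive and .id attributes:
# ids = Seq(widgets).where(lambda w: w.alive).select(lambda w: w.id).to_list()
```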
and I also dislike Rust requiring you to write "mut" for function mutable values. It's mostly just busywork and dogma.
Mond_ [3 hidden]5 mins ago
Yeah, I really wanted to avoid a discussion over functional vs. imperative programming, so I just... didn't talk about the imperative style at all, and just said so in the first section.
I think the imperative style isn't as readable (of course I would), but that's absolutely a discussion for another day, and I get why people prefer it.
EnPissant [3 hidden]5 mins ago
I think it’s important to point out what the imperative version would look like because I think the fundamental reason that the method chaining approach is more readable is because it more closely resembles imperative code. When reading it, you start with a vector and then you mutate it in various ways before returning it. I understand that’s not how it’s implemented under the hood, but I don’t think it really matters.
otsukare [3 hidden]5 mins ago
I wish more languages would aim for infix functions (like Haskell and Kotlin), rather than specifically the pipe operator.
hliyan [3 hidden]5 mins ago
I always wondered how programming would be if we hadn't designed the assignment operator to be consistent with mathematics, and instead had it go LHS -> RHS, i.e. you perform the operation and then decide its destination, much like Unix pipes.
RodgerTheGreat [3 hidden]5 mins ago
Plenty of LTR languages to choose from, especially concatenative languages like Forth, Joy, or Factor.
The APL family is similarly consistent, except RTL.
donatj [3 hidden]5 mins ago
TI-BASIC is like this with its store operator →. I always liked it.
10→A
A+10→C
recursive [3 hidden]5 mins ago
It also has an = operator, which saves the whole expression and re-evaluates it every time it is used.
donatj [3 hidden]5 mins ago
Yep, CAS right in the BASIC on the 89 and up is a truly magical experience
remram [3 hidden]5 mins ago
For function calls too? List the arguments then the function's name?
dapperdrake [3 hidden]5 mins ago
Pipelining in software is covered by Richard C. Waters (1989a, 1989b). Wrangled this library to work with JavaScript. Incredibly effective. Much faster at writing and composing code. And this code executes much faster.
The one thing that I don’t like about pipelining (whether using a pipe operator or method chaining), is that assigning the result to a variable goes in the wrong direction, so to speak. There should be an equivalent of the shell’s `>` for piping into a variable as the final step. Of course, if the variable is being declared at the same time, whatever the concrete syntax is would still require some getting used to, being “backwards” compared to regular assignment/initialization.
AndyKluger [3 hidden]5 mins ago
FWIW in Factor you can set dynamic variables with
"coolvalue" thisisthevar set
or if you use the `variables` vocab, alternately:
"coolvalue" set: thisisthevar
and lexical variables are set with
"coolvalue" :> thisisthevar
jiggunjer [3 hidden]5 mins ago
Exists in R:
Mydata %>% myfun -> myresult
wavemode [3 hidden]5 mins ago
> At this point you might wonder if Haskell has some sort of pipelining operator, and yes, it turns out that one was added in 2014! That’s pretty late considering that Haskell exists since 1990.
The tone of this (and the entire Haskell section of the article, tbh) is rather strange. Operators aren't special syntax and they aren't "added" to the language. Operators are just functions that by default use infix position. (In fact, any function can be called in infix position. And operators can be called in prefix position.)
The commit in question added & to the prelude. But if you wanted & (or any other character) to represent pipelining you have always been able to define that yourself.
Some people find this horrifying, which is a perfectly valid opinion (though in practice, when working in Haskell it isn't much of a big deal if you aren't foolish with it). But at least get the facts correct.
pxc [3 hidden]5 mins ago
Maybe it's because I love the Unix shell environment so much, but I also really love this style. I try to make good use of it in every language I write code in, and I think it helps make my control flow very simple. With lots of pipelines, and few conditionals or loops, everything becomes very easy to follow.
stuaxo [3 hidden]5 mins ago
It's part of why JQuery was so great, and the Django ORM.
flakiness [3 hidden]5 mins ago
After seeing LangChain abusing the "|" operator overload for a pipeline-like DSL, I followed suit at work and I loved it. It's especially good when you use it in a notebook environment where you literally build the pipeline incrementally through the repl.
shae [3 hidden]5 mins ago
If Python object methods returned `self` by default instead of `None` you could do this in Python too!
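For example, list.sort() returns None and breaks the chain; a thin wrapper that returns self restores it (a toy sketch):

```python
nums = [3, 1, 2]
# list.sort() mutates in place and returns None, so chaining raises AttributeError:
# nums.sort().reverse()

class FluentList(list):
    """Toy list whose mutating methods return self so calls can chain."""
    def sort(self, *args, **kwargs):
        super().sort(*args, **kwargs)
        return self

    def reverse(self):
        super().reverse()
        return self

print(FluentList([3, 1, 2]).sort().reverse())  # [3, 2, 1]
```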
Interestingly though, the actual integrated query part is much less useful or widely used than the methods on IEnumerable etc.
jesse__ [3 hidden]5 mins ago
I've always wondered why more languages don't do this. It just makes sense
raggi [3 hidden]5 mins ago
> (This is not real Rust code. Quick challenge for the curious Rustacean, can you explain why we cannot rewrite the above code like this, even if we import all of the symbols?)
and you can, because it's lazy, which is also the same reason you can write it the other way in Rust. I think the author was getting at an ownership trap, but that trap is avoided the same way for both arrangements; the call order is the same in both. If the calls were actually a pipeline (if collect didn't exist and didn't need to be called) then other considerations show up.
Mond_ [3 hidden]5 mins ago
Guilty as charged, I did not know about the `import_trait_associated_functions` feature at the time. I might add a note to the article to clarify this.
dpc_01234 [3 hidden]5 mins ago
I think there's a language syntax to be invented that would make everything suffix/pipeline-based. Stack based languages are kind of there, but I don't think exactly the same thing.
Is pipelining the right term here? I've always used the term "transducer" to describe this kind of process, I picked it up from an episode of FunFunFunction if I'm not mistaken.
Mond_ [3 hidden]5 mins ago
There's like 5 terms for this in different programming languages. I think 'pipelining' is the best universal word. 'Method chaining' just isn't correct, nor is 'builder pattern', and 'transducer' or 'thrush combinator' is obviously a nonstarter for most people.
Also, does the name matter if it works the same and has the same properties?
Maybe the author called it "pipelines" to avoid functional purists from nitpicking it.
_heimdall [3 hidden]5 mins ago
Yeah I do think of transducers as a functional paradigm, I read the article as describing a very functional paradigm as well.
In the context of a specific programming language feature it seems like terminology would be important, I wasn't trying to nitpick unintentionally.
alganet [3 hidden]5 mins ago
It is important for the functional guys, and I recognize the importance it has for them.
These "pipelines" and "object streaming" APIs are often built upon OOP. I feel that calling it "transducers" would offend the sensibilities of those who think it must be functional all the way down.
Don't you think it's better to keep it with a different name? I mean, even among the functional community itself there seems to be a lot of stress around purity, why would anyone want to make it worse?
_heimdall [3 hidden]5 mins ago
I may have just misunderstood the OP. It sounded to me like describing the benefits specifically of transducers, but if it was OOP and more just about piping operators or chaining the term wouldn't fit.
taeric [3 hidden]5 mins ago
A thing I really like about pipelines in shell scripts, is all of the buffering and threading implied by them. Semantically, you can see what command is producing output, and what command is consuming it. With some idea of how the CPU will be split by them.
This is far different than the pattern described in the article, though. Small shame they have come to have the same name. I can see how both work with the metaphor; such that I can't really complain. The "pass a single parameter" along is far less attractive to me, though.
TrianguloY [3 hidden]5 mins ago
Kotlin sort of have it with let (and run)
a().let{ b(it) }.let{ c(it) }
hombre_fatal [3 hidden]5 mins ago
Yeah, Kotlin's solution is nice because it's so general: you can chain on to anything instead of needing everyone to implement a builder pattern.
And it's already idiomatic unlike bolting a pipeline operator onto a language that didn't start with it.
jillesvangurp [3 hidden]5 mins ago
If you see somebody using a builder in Kotlin, they're basically doing it wrong. You can usually get rid of that stuff with a 1 line extension function (for example if it's some Java API that's being called).
// extension function on Foo.Companion (similar to static class function in Java)
fun Foo.Companion.create(block: FooBuilder.() -> Unit): Foo =
FooBuilder().apply(block).build()
// example usage
val myFoo = Foo.create {
setSomeproperty("foo")
setAnotherProperty("bar")
}
Works for any Java/Kotlin API that forces you into method chaining and calling build() manually. Also works without extension functions. You can just call it fun createAFoo(..) or whatever. Looking around in the Kotlin stdlib code base is instructive. Lots of little 1/2 liners like this.
true_blue [3 hidden]5 mins ago
That new Rhombus language that was featured here recently has an interesting feature where you can use `_` in a function call to act as a "placeholder" for an argument. Essentially it's an easy way to partially apply a function. This works very well with piping because it allows you to pipe into any argument of a function (including optional arguments iirc) rather than just the first like many pipe implementations have. It seems really cool!
To one up this: Of course it is even better, if your language allows you to implement proper pipelining with implicit argument passing by yourself. Then the standard language does not need to provide it and assign meaning to some symbols for pipelining. You can decide for yourself what symbols are used and what you find intuitive.
Pipelining can guide one to write a bit cleaner code, viewing steps of computation as such, and not as modifications of global state. It forces one to make each step return a result, write proper functions. I like proper pipelining a lot.
Mond_ [3 hidden]5 mins ago
> if your language allows you to implement proper pipelining with implicit argument passing by yourself
> You can decide for yourself what symbols are used and what you find intuitive
i mean this sounds fun
but tbh it also sounds like it'd result in my colleague Carl defining an utterly bespoke DSL in the language, and using it to write the worst spaghetti code the world has ever seen, leaving the code base an unreadable mess full of sharp edges and implicit behavior
0x1ceb00da [3 hidden]5 mins ago
Even veterans mess things up if you use too many of these exotic syntaxes. For loops and if statements rock, but they aren't cool and functional so they aren't discussed much.
chewbacha [3 hidden]5 mins ago
Is this pipelining or the builder pattern?
meltyness [3 hidden]5 mins ago
Pipes and filters are considered an architectural pattern, whereas Builder is a GoF OOP pattern, so yes.
ivanjermakov [3 hidden]5 mins ago
I usually call it method chaining, which the builder pattern makes use of.
Mond_ [3 hidden]5 mins ago
"These are the same picture." (Sort of.)
Weryj [3 hidden]5 mins ago
LINQ was my gateway drug into functional programming, Pipelining is so beautiful.
1899-12-30 [3 hidden]5 mins ago
You can somewhat achieve a pipelined like system in sql by breaking down your steps into multiple CTEs. YMMV on the performance though.
infogulch [3 hidden]5 mins ago
Yeah, the way to get logical pipelining in SQL without CTEs is nested subqueries in the FROM clause. Unfortunately, the nesting is syntactically ugly and confusing to read which is basically the whole idea behind pipeline syntax.
amelius [3 hidden]5 mins ago
Am I the only one who thinks yuck?
Instead of writing: a().b().c().d(), it's much nicer to write: d(c(b(a()))), or perhaps (d ∘ c ∘ b ∘ a)().
vinceguidry [3 hidden]5 mins ago
Why, if you don't have to, would you write the functions in reverse order of when they're applied?
gloxkiqcza [3 hidden]5 mins ago
Presumably because they’ve been doing so for decades so it seems logical and natural in their head while the new thing is new and thus unintuitive.
tgv [3 hidden]5 mins ago
It's a bit snarky, but would you rather write FORTH then? So instead of
line.start, line.end |> draw;
i, a, b |> div |> round |> print;
sph [3 hidden]5 mins ago
Still too much syntax for a Forth. It should be
line.start line.end draw
i a b div round print
nh23423fefe [3 hidden]5 mins ago
function application is right associative?
dboreham [3 hidden]5 mins ago
Because it makes more sense?
kortex [3 hidden]5 mins ago
I would wager within a rounding error, all humans have a lifetime of experience in following directions of the form:
1. do the first step in the process
2. then do the next thing
3. followed by a third action
I struggle to think of any context outside of programming, retrosynthesis in chemistry, and some aspects of reverse-Polish notation calculators, where you conceive of the operations/arguments last-to-first. All of which are things typically encountered pretty late in one's educational journey.
amelius [3 hidden]5 mins ago
Consistency is more important. If you ever wrote:
a(b())
then you're already breaking your left-to-right/first-to-last rule.
ndriscoll [3 hidden]5 mins ago
There are some math books out there that use (x)f. My understanding is (some) algebraists tried to make it a thing ~60 years ago but it never really caught on.
hollerith [3 hidden]5 mins ago
There are, but they leave out the parens and use context to distinguish function application from multiplication.
regular_trash [3 hidden]5 mins ago
This is a foolish consistency, and a contrived counterexample. Consistency is not an ideal unto itself.
Mond_ [3 hidden]5 mins ago
Does it actually make more sense, or is it just more familiar?
duped [3 hidden]5 mins ago
A subtlety that I think many people overlook is that putting function application in lexicographical order means that tools can provide significantly better autocomplete results without needing to add a magic keybinding.
queuebert [3 hidden]5 mins ago
It probably wouldn't hurt for languages to steal more ideas from APL.
archargelod [3 hidden]5 mins ago
Add a couple more arguments to each function and you'll get that first variant is a lot nicer:
a(axe).b(baz, bog).c(cid).d(dot)
vs
d(c(b(a(axe), baz, bog), cid), dot)
adzm [3 hidden]5 mins ago
When a b c d are longer expressions, the pipeline version looks more readable especially when split on multiple lines since it only has one level of indentation and you don't have to think about the number of parentheses at the end.
ZYbCRq22HbJ2y7 [3 hidden]5 mins ago
It's nice sugar, but pretty much any modern, widely used language supports "pipelining", just not of the SML flavor.
bcoates [3 hidden]5 mins ago
Why is the SQL syntax so unnecessarily convoluted? SQL is already an operator language, just an overly constrained one due to historical baggage. If you're going to allow new syntax at all, you can just do
from customer
left join orders on c_custkey = o_custkey and o_comment not like '%unusual%'
group by c_custkey
alias count(o_orderkey) as count_of_orders
group by count_of_orders
alias count(*) as count_of_customers
order by count_of_customers desc
select count_of_customers, count_of_orders;
I'm using 'alias' here as a strawman keyword for what the slide deck calls a free-standing 'as' operator because you can't reuse that keyword, it makes the grammar a mess.
The aliases aren't really necessary, you could just write the last line as 'select count(count(*)) ncust, count(*) nord' if you aren't afraid of nested aggregations, and if you are you'll never understand window functions, soo...
The |> syntax adds visual noise without expressive power, and the novelty 'aggregate'/'call' operators are weird special-case syntax for something that isn't that complex in the first place.
The implicit projection is unnecessary too, for the same reason any decent SQL linter will flag an ambiguous 'select *'
zeroimpl [3 hidden]5 mins ago
I think they are solving two different problems at the same time. One is the order of elements in a single operation (SELECT then FROM then WHERE etc), and the second is the actual pipelining which replaces the need for nested queries.
It does seem like the former could be solved by just loosening up the grammar to allow you to specify things in any order. Eg this seems perfectly unambiguous:
from customer
group by c_custkey
select c_custkey, count(*) as count_of_customers
bcoates [3 hidden]5 mins ago
Yeah, exactly. You don't need literal pipes
drchickensalad [3 hidden]5 mins ago
I miss F#
aloisdg [3 hidden]5 mins ago
So do I sibling. so do I
kuon [3 hidden]5 mins ago
That's also why I enjoy elixir a lot.
The |> operator is really cool.
bluSCALE4 [3 hidden]5 mins ago
Same. The sad part is that pipelining seems to be something AI is really good at so I'm finding myself writing less of it.
joeevans1000 [3 hidden]5 mins ago
Clojure threading, of course.
relaxing [3 hidden]5 mins ago
These articles never explain what’s wrong with calling each function separately and storing each return value in an intermediate variable.
Being able to inspect the results of each step right at the point you’ve written it is pretty convenient. It’s readable. And the compiler will optimize it out.
jaymbo [3 hidden]5 mins ago
This is why I love Scala so much
rad_gruchalski [3 hidden]5 mins ago
Scala is by far one of the nicest programming languages I have ever worked with. Scala with no JVM dependency would be a killer programming language, BUT only when all async features work out of the box like they do on the JVM. It's been attempted a couple of times and it never succeeded.
> allows you to omit a single argument from your parameter list, by instead passing the previous value
I have no idea what this is trying to say, or what it has to do with the rest of the article.
delusional [3 hidden]5 mins ago
It's getting at the essential truth that for all(?) mainstream languages since object orientation and the dot syntax became a thing `a.b()` implicitly includes `a` as the first argument to the actual method `b(a self)`. Different languages have different constructs on top of that, C++ for example includes a virtual dispatch mechanism, but the one common idea of the _method call_ is that the `self` pointer is passed as the first argument.
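Python makes this explicit; the bound-method call and the plain function call are the same thing:

```python
class Greeter:
    def greet(self, name):
        return f"hi {name}"

g = Greeter()
print(g.greet("ada"))           # the instance is passed implicitly as `self`
print(Greeter.greet(g, "ada"))  # exactly the same call, with `self` passed explicitly
```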
jiggawatts [3 hidden]5 mins ago
PowerShell has the best pipeline capability of any language I have ever seen.
For comparison, UNIX pipes support only trivial byte streams from output to input.
PowerShell allows typed object streams where the properties of the object are automatically wired up to named parameters of the commands on the pipeline.
Outputs at any stage can not only be wired directly to the next stage but also captured into named variables for use later in the pipeline.
Every command in the pipeline also gets begin/end/cancel handlers automatically invoked so you can set up accumulators, authentication, or whatever.
UNIX scripting advocates don’t know what they’re missing out on…
account-5 [3 hidden]5 mins ago
You should look at Nushell. Much as I like powershell, Nushell just seems better.
Yeah, it's younger than powershell so there's some rough edges in some places. I wouldn't let empty headings in the docs stop you using/experimenting with it though.
I use powershell daily but am hopeful that I can replace it with Nushell at some point.
jmyeet [3 hidden]5 mins ago
Hack (Facebook's PHP fork) has this feature. It's called pipes [1]:
$x = vec[2,1,3]
|> Vec\map($$, $a ==> $a * $a) // $$ with value vec[2,1,3]
|> Vec\sort($$); // $$ with value vec[4,1,9]
It is a nice feature. I do worry about error reporting with any feature that combines multiple statements into a single statement, which is essentially what this does. In Java, there was always an issue with NullPointerExceptions being thrown, and if you chain several things together you're never sure which one was null.
Wait. Isn't that already solved in Java? Optional, Mono, Flux, etc.
I remember being able to deal with object streams with it quite comfortably.
jmyeet [3 hidden]5 mins ago
Any function that can return an object in Java can return a null. Nullability not being part of the type system I think is a design fail.
alganet [3 hidden]5 mins ago
Yeah, so the null checks become annoying if you use only language fundamentals.
Java has a culture of having a layer above fundamentals.
We're past all that already. I am discussing the ergonomics of their null checking APIs, particularly in the context of pipelining (or streaming, in the Java world).
I find them quite comfortable.
XorNot [3 hidden]5 mins ago
Every example of why this is meant to be good is contrived.
You have a create_user function that doesn't error? Has no branches based on type of error?
We're having arguments over the best way break these over multiple lines?
Like.. why not just store intermediate results in variables? Where our branch logic can just be written inline? And then the flow of data can be very simply determined by reading top to bottom?
wslh [3 hidden]5 mins ago
I also like a syntax that includes pipelining parallelization, for example:
A
.B
.C
|| D
|| E
regular_trash [3 hidden]5 mins ago
Wouldn't this complicate variable binding? I'm unsure how to think about this kind of syntax if either D or E are expected to return some kind of data instead of being "fire and forget" processes.
tpoacher [3 hidden]5 mins ago
pipelines are great IF you can easily debug them as easily as temp variable assignments
... looking at you R and tidyverse hell.
guerrilla [3 hidden]5 mins ago
This is just super basic functional programming. Seems like we're taking the long way around...
Mond_ [3 hidden]5 mins ago
Have you read the article? This isn't about functional vs. imperative programming, it's (if anything) about two different ways to write functional code.
guerrilla [3 hidden]5 mins ago
Keywords "super basic". You learn this in a "my first Haskell" tutorials. Seems tortured in whatever language that is though.
jongjong [3 hidden]5 mins ago
Pipelining is great. Currying is horrible. Though currying superficially looks similar to pipelining.
One difference is that currying returns an incomplete result (another function) which must be called again at a later time. On the other hand, pipelining usually returns raw values. Currying returns functions until the last step. The main philosophical failure of currying is that it treats logic/functions as if they were state which should be passed around. This is bad. Components should be responsible for their own state and should just talk to each other to pass plain information. State moves, logic doesn't move. A module shouldn't have awareness of what tools/logic other modules need to do their jobs. This completely breaks the separation of concerns principle.
When you call a plumber to fix your drain, do you need to provide them with a toolbox? Do you even need to know what's inside their toolbox? The plumber knows what tools they need. You just show them what the problem is. Passing functions to another module is like giving a plumber a toolbox which you put together by guessing what tools they might need. You're not a plumber, why should you decide what tools the plumber needs?
Currying encourages spaghetti code which is difficult to follow when functions are passed between different modules to complete the currying. In practice, if one can design code which gathers all the info it needs before calling the function once; this leads to much cleaner and much more readable code.
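A toy sketch of the contrast being described (names are illustrative):

```python
def add(a, b):
    return a + b

def curried_add(a):
    # returns an incomplete result: another function still waiting for `b`
    return lambda b: a + b

print(add(2, 3))          # 5: gather all the information, call once
print(curried_add(2)(3))  # 5: a function gets handed around before the final call
```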
blindseer [3 hidden]5 mins ago
This article is great, and really distills why the ergonomics of Rust is so great and why languages like Julia are so awful in practice.
jakobnissen [3 hidden]5 mins ago
You mean tab completion in Rust? Otherwise, let me introduce you to:
imap(f) = x -> Iterators.map(f, x)
ifilter(f) = x -> Iterators.filter(f, x)
v = things |>
ifilter(isodd) |>
imap(do_process) |>
collect
There is certainly a trade off, but as a codebase grows larger and deals with more cases where the same code needs to be applied, the benefits of a concise yet expressive notation shows.
Bit annoying, but serviceable. Though there's nothing wrong with your approach either.
https://gist.github.com/user-attachments/assets/3329d736-70f...
https://peps.python.org/pep-0289/
I believe the correct definition for this concept is the Thrush combinator[0]. In some ML-based languages[1], such as F#, the |> operator is defined[2] for same:
Other functional languages have libraries which also provide this operator, such as the Scala Mouse[3] project.
0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush
1 - https://en.wikipedia.org/wiki/ML_(programming_language)
2 - https://fsharpforfunandprofit.com/posts/defining-functions/
3 - https://github.com/typelevel/mouse?tab=readme-ov-file
Unless I misunderstood the author, because method chaining is super common where I feel thrush operators are pretty rare, I would be surprised if they meant the latter.
Other than that I think both styles are fine.
Another problem of having different names for each step is that you can no longer quickly comment out a single step to try things out, which you can if you either have the pipeline or a single variable name.
Given File.readlines("haystack.txt"), the entire file must be resident in memory before .grep(/needle/) is performed, which may cause unnecessary memory utilization. IIRC, in frameworks like Polars, the chain-ending collect() method means the preceding methods only build up a lazy query, so the engine doesn't have to pull the entire corpus into memory just to operate on a subset of it.
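A rough Polars sketch of that idea (the file and column names here are made up): scan_csv builds a lazy plan and collect() triggers execution, so the earlier steps don't eagerly materialize the whole file.

```python
import polars as pl

# Lazy query: scan_csv only builds a plan, nothing is read yet.
query = (
    pl.scan_csv("haystack.csv")
      .filter(pl.col("line").str.contains("needle"))
      .with_columns(pl.col("line").str.replace_all("foo", "bar"))
)

df = query.collect()  # the plan is optimized and executed only here
```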
I myself agree, and find myself doing that too, especially in frontend code that executes in a browser. Debuggability is much more important than marginally-better readability, for production code.
This helped me quickly develop a sense for how code is optimized and what code is eventually executed.
But, yes, naive call chaining like that is sometimes a significant performance problem in the real world. For example, in the land of JavaScript. One of the more egregious examples I've personally seen was a Bash script that used Bash arrays rather than pipelines, though in that case it had to do with the loss of concurrency, not data churn.
the chaining really only works if your language is strongly typed and you are somewhat guaranteed that variables will be of expected type.
I do it in a similar way you mentioned
But with actually checked in code, the tradeoff in readability is pretty substantial
Processes run in parallel, but they process the data in a strict sequential order: «grep» must produce a chunk of data before «sed» can proceed, and «sed» must produce another chunk of data before «xargs» can do its part. «xargs» in no way can ever pick up the output of «grep» and bypass the «sed» step. If the preceding step is busy crunching the data and is not producing the data, the subsequent step will be blocked (the process will fall asleep). So it is both, a pipeline and a chain.
It is actually a directed data flow graph.
Also, if you replace «haystack.txt» with a /dev/haystack, i.e.
and /dev/haystack is waiting on the device it is attached to to yield a new chunk of data, then all three of «grep», «sed» and «xargs» will block.
However.
I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.
```
params
|> Map.get("user")
|> create_user()
|> notify_admin()
```
The problem is that method-chaining is common in several OO languages, including Ruby. This means the functions on an object return an object, which can then call other functions on itself. In contrast, the pipe operator calls a function, passing in what's on the left side of it as the first argument. In order to work properly, this means you'll need functions that take the data as the first argument and return the same shape to return, whether that's a list, a map, a string or a struct, etc.
When you add a pipe operator to an OO language where method-chaining is common, you'll start getting two different types of APIs and it ends up messier than if you'd just stuck with chaining method calls. I much prefer passing immutable data into a pipeline of functions as Elixir does it, but I'd pick method chaining over a mix of method chaining and pipelines.
https://github.com/tc39/proposal-pipeline-operator
I'm very excited for it.
Our only hope is if TypeScript finally gives up on the broken TC39 process and starts to implement its own syntax enhancements again.
[1] https://2024.stateofjs.com/en-US/usage/#top_currently_missin...
More specifically, with the (also ironically gummed up in tc39) type syntax [1], and importantly node introducing the --strip-types option [2], TS is only ever going to look more and more like standards compliant JS.
[1] https://tc39.es/proposal-type-annotations/
[2] https://nodejs.org/en/blog/release/v22.6.0
It seems like most people are just asking for the simple function piping everyone expects from the |> syntax, but that doesn't look likely to happen.
I'm not a JS dev so idk what member property support is.
This also gets weird because if the `|>` is a special function that sends in a magic `%` parameter, it'd have to be context sensitive to whether or not an `async` thing happens within the bounds. Whether or not it does will determine if the subsequent pipes are dealing with a future of % or just % directly.
In reality it would look like:
(assuming getFuture and bat are both async). You do need |> to be aware of the case where the await keyword is present, but that's about it. The above would effectively transform to:
I don't see the problem with this.
Typically in JS you do this with parens like so:
(await getFutureAsyncFactory())("input")
But the use of parens doesn't transpose to the pipeline setting well IMO
Given this example:
the getFutureAsyncFactory function is async, but the function it returns is not (or it may be and we just don't await it). Basically, using |> like you stated above doesn't do what you want. If you wanted the same semantics, you would have to do something like this to invoke the returned function:
The whole pipeline takes on the value of the last function specified.
But the latter is syntactically indistinguishable from
What do you think about a variant where you don't really call the function with () in the pipeline syntax? I think that would be more natural.
For the last example, it would look like:
assuming f() is still async and returns a function. g() must be a function, so the parentheses have to be added.
They list this as a con of F# (also Elixir) pipes:
The insistence on an arrow function is pure hallucination. It should be perfectly achievable as it is in these other languages. What's more, doing so removes all of the handwringing about await. And I'm frankly at a loss why you would want to put yield in the middle of one of these chains instead of after.
I haven't looked at the member properties bits, but I suspect the pipeline syntax just needs the transform to be supported in build tools, rather than adding yet another polyfill.
Note also that it works well in Elixir because it was created at the same time as most of the standard library. That means that the standard library takes the relevant argument in the first position all the time. Very rarely do you need to pipe into the second argument (and you need a lambda or convenience function to make that work).
``` params.get("user") |> create_user |> notify_admin ```
Even more concise and it doesn't even require a special language feature, it's just regular syntax of the language ( |> is a method like .get(...), so you could even write `params.get("user").|>(create_user)` if you wanted to)
Also, what if the function you want to use is returned by some nullary function? You couldn't just do |> getfunc(), as presumably the pipeline operator will interfere with the usual meaning of the parentheses and will try to pass something to getfunc. Would |> ( getfunc() ) work? This is the kind of problem that can arise when one language feature is permitted to change the ordinary behaviour of an existing feature in the name of convenience. (Unless of course I'm just missing something.)
I just find this syntax too inconsistent and vague, and hence actually annoying. Which is why I prefer defining pipes as composition of functions which can then be applied to whatever data. Then eg one can write sth like `(|> foo1 foo2 (foo3) #(foo4 % y))` and know that foo1 and foo2 are references to functions, foo3 evaluates to another function, and when one needs more arguments in foo4 they have to explicitly state that. This gives another function, and there is no ambiguity here whatsoever.
That's actually true. In Scala that is not so nice, because then it becomes `x |> foo(_, arg2)` or, even worse, `x |> (param => foo(param, arg2))`. I have a few such cases in my sourcecode and I really don't like it. Haskell and PureScript do a much better job keeping the code clean in such cases.
I agree with that and it confused me that it looks like the function is not referenced but actually applied/executed.
In Elixir, it is just a macro so it applies to all functions. I'm only a Scala novice so I'm not sure how it would work there.
Yes exactly, which is why it is not equivalent. No macro needed here. In Scala 2 syntax:
``` implicit class AnyOps[A](private val a: A) extends AnyVal { def |>[B](f: A => B) = f(a) } ```
This is usually the Thrush combinator[0], exists in other languages as well, and can be informally defined as:
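A minimal sketch of that combinator in Python terms (the actual F# and Gleam definitions are in the links below):

```python
def thrush(x, f):
    """Take a value and a function, apply the function to the value."""
    return f(x)

# a "pipeline" is then just repeated application:
result = thrush(thrush("  widget  ", str.strip), str.upper)  # "WIDGET"
```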
0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush
[1] https://tour.gleam.run/functions/pipelines/
Is there any language with a single feature that gives the best of both worlds?
```
params
|> Map.get("user")
|> create_user()
|> (¬ify_admin("signup", &1)).() ```
or
```
params
|> Map.get("user")
|> create_user()
|> (fn user -> notify_admin("signup", user) end).()
```
(No disagreements with your post, just want to give credit where it's due. I'm also a big fan of the syntax)
My dislike does improve my test coverage though, since I tend to pop out a real method instead.
https://github.com/rplevy/swiss-arrows https://github.com/hipeta/arrow-macros
Instead of:
```
fetch_data()
|> (fn
  {:ok, val, _meta} -> val
  :error -> "default value"
end).()
|> String.upcase()
```
Something like this:
```
fetch_data()
|>? {:ok, val, _meta} -> val
|>? :error -> "default value"
|> String.upcase()
```
This is for sequential conditions. If you have nested conditions, check out a where block instead. https://dev.to/martinthenth/using-elixirs-with-statement-5e3...
In my programming language, I added `.>` as a reverse-compose operator, so pipelines of function compositions can also be read uniformly left-to-right, e.g.
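Roughly, in Python terms (the original example is in the commenter's own language; `rcompose` here is an illustrative stand-in):

```python
from functools import reduce

def rcompose(*fns):
    """Left-to-right composition: rcompose(f, g, h)(x) == h(g(f(x)))."""
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)

strip_and_shout = rcompose(str.strip, str.upper)
print(strip_and_shout("  hello  "))  # "HELLO"
```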
There is also https://hackage.haskell.org/package/flow which uses .> and <. for function composition.
EDIT: in no way do I want to claim the originality of these things in Elm or the Haskell package inspired by it. AFAIK |> came from F# but it could be miles earlier.
For example, we can write: (foo (bar (baz x))) as (-> x baz bar foo)
If there are additional arguments, we can accommodate those too: (sin (* x pi)) as (-> x (* pi) sin)
where the expression so far gets inserted as the first argument to any form. If you want it inserted as the last argument, you can use ->> instead:
(filter positive? (map sin x)) as (->> x (map sin) (filter positive?))
You can also get full control of where to place the previous expression using as->.
Full details at https://clojure.org/guides/threading_macros
That's sort of an argument for the existence of macros as a whole, you can't really do this as neatly in something like python (although I've tried) - I can see the downside of working in a codebase with hundreds of these kind of custom language features though.
I use these with xforms transducers.
https://github.com/johnmn3/injest
One day, we'll (re)discover that partial application is actually incredibly useful for writing programs and (non-Haskell) languages will start with it as the primitive for composing programs instead of finding out that it would be nice later, and bolting on a restricted subset of the feature.
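Python already has a bolted-on form of this via functools.partial; a small illustration:

```python
from functools import partial

def scale(factor, x):
    return factor * x

double = partial(scale, 2)           # fix the first argument, get a new function back
print(list(map(double, [1, 2, 3])))  # [2, 4, 6]
```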
Recently I started using Nushell, which feels very similar.
And for exceptions, why not solve it in the data model, and reify failures? Push it further downstream, let your pipeline's nodes handle "monadic" result values.
Point being, it's always a tradeoff, but you can usually lessen the pain more than you think.
And that's without mentioning that a lot of "pipelining" is pure sugar over the same code we're already writing.
Programming should be focused on the happy path. Much of the syntax in primitive languages concerning exceptions and other early returns is pure noise.
Exception handing is only a problem in languages that use exceptions. Fortunately there are many modern alternatives in wide use that don't use exceptions.
I've encountered and used this pattern in Python, Ruby, Haskell, Rust, C#, and maybe some other languages. It often feels nice to write, but reading can easily become difficult -- especially in Haskell where obscure operators can contain a lot of magic.
Debugging them interactively can be equally problematic, depending on the tooling. I'd argue, it's commonly harder to debug a pipeline than the equivalent imperative code and, that in the best case it's equally hard.
If you need to handle an unhappy path in a way that isn’t optimal for nested function calls then you shouldn’t be nesting your function calls. Pipelining doesn’t magically make things easier nor harder in that regard.
But if a particular sequence of function calls do suit nesting, then pipelining makes the code much more readable because you’re not mixing right-to-left syntax (function nests) with left-to-right syntax (ie you’re typical language syntax).
Crudely put, in C-like languages, pipelining is just a way of turning
where the first function call is in the inner, right-most parentheses, into this:
…which can be easily read sequentially from left-to-right.
The old adage of not writing code so smart you can't debug it applies here.
Pipelining runs contrary enough to standard imperative patterns. You don’t just need a new mindset to write code this way. You need to think differently about how you structure your code overall and you need different tools.
That’s not to say that doing things a different way isn’t great, but it does come with baggage that you need to be in a position to carry.
Rust chains everything because of this. It's often unpleasant (see: all the Rust GUI toolkits).
Not that it matters much, the switching costs are immense. Getting people able to teach it would be impossible, and collaboration with people taught in the other system would be horrible. I am doubtful I could make the switch, even if I wanted.
The major advantage is that handling multiple streams is natural. Suppose you want to compute the dot product of two files where each line contains a float:
Because I love to practice and demonstrate Factor, this is working code for that example:
Pipelining can become hard to debug when chains get very long. The author doesn't address how hard it can be to identify which step in a long chain caused an error.
They do make fun of Python, however. But don't say much about why they don't like it other than showing a low-res photo of a rock with a pipe routed around it.
Ambiguity about what constitutes "pipelining" is the real issue here. The definition keeps shifting throughout the article. Is it method chaining? Operator overloading? First-class functions? The author uses examples that function very differently.
Yeah, I agree that this can be problem when you lean heavily into monadic handling (i.e. you have fallible operations and then pipe the error or null all the way through, losing the information of where it came from).
But that doesn't have much to do with the article: You have the same problem with non-pipelined functional code. (And in either case, I think that it's not that big of a problem in practice.)
> The author uses examples that function very differently.
Yeah, this is addressed in one of the later sections. Imo, having a unified word for such a convenience feature (no matter how it's implemented) is better than thinking of these features as completely separate.
It's not like you lose that much readability from
But you could in many cases easily infer from the execution plan what a query would look like and fetch an intermediate set separately.
[0] https://www.flow-storm.org/
>Let me make it very clear: This is [not an] article, it's a hot take about syntax. In practice, semantics beat syntax every day of the week. In other words, don't take it too seriously.
Just before that statement, he says that it is an article/hot take about syntax. He acknowledges your point.
So I think when he says "semantics beat syntax every day of the week", that's him acknowledging that while he prefers certain syntax, it may not be the best for a given situation.
Building pipelines:
https://effect.website/docs/getting-started/building-pipelin...
Using generators:
https://effect.website/docs/getting-started/using-generators...
Having both options is great (at the beginning effect had only pipe-based pipelines), after years of writing effect I'm convinced that most of the time you'd rather write and read imperative code than pipelines which definitely have their place in code bases.
In fact most of the community, at large, converged at using imperative-style generators over pipelines and having onboarded many devs and having seen many long-time pipeliners converging to classical imperative control flow seems to confirm both debugging and maintenance seem easier.
In fact, I always thought it would be a good idea for all statement blocks (in any given programming language) to allow an implicit reference to the value of the previous statement. The pipeline operation would essentially be the existing semicolons (in a C-like language) and there would be a new symbol or keyword used to represent the previous value.
For example, the MATLAB REPL allows for referring to the previous value as `ans` and the Julia REPL has inherited the same functionality. You can copy-paste this into the Julia REPL today:
You can't use this in Julia outside the REPL, and I don't think `ans` is a particularly good keyword for this, but I honestly think the concept is good enough. The same thing in JavaScript using `$` as an example:
I feel it would work best with expression-based languages having blocks that return their final value (like Rust), since you can do all sorts of nesting and so on.
I think most interactive programming shells have an equivalent.
No longer do we have to explain that expressions are evaluated in the order of FROM -> JOIN -> ON -> SELECT -> WHERE -> GROUP BY -> HAVING -> ORDER BY -> LIMIT (and yes, I know I'm missing several other steps). We can simply just express how our data flows from one statement to the next.
(I'm also stating this as someone who has yet to play around with the pipelining syntax, but honestly anything is better than the status quo.)
It is an exemplar of expressions [0] more than anything else, which have little to do with the idea of passing results from one method to another.
[0]: https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
    data.iter()              // get iterator over elements of the list
        .filter(|w| w.alive) // use lambda to ignore tombstoned widgets
        .map(|w| w.id)       // extract ids from widgets
        .collect()           // assemble iterator into data structure (Vec)
}
Same thing in 15 year old C# code.
List<Guid> GetIds(List<Widget> data)
{
    // property names (Alive, Id) assumed to mirror the Rust example's fields
    return data
        .Where(w => w.Alive)
        .Select(w => w.Id)
        .ToList();
}
In this case I would say extension methods are what he's really referring to, of which Linq to objects is built on top of.
[1]: https://learn.microsoft.com/en-us/dotnet/csharp/linq/get-sta...
[1] https://prql-lang.org/
https://www.youtube.com/watch?v=c1gfbbE2zts
Point-free style and pipelining were meant for each other. https://en.m.wikipedia.org/wiki/Tacit_programming
> Quick challenge for the curious Rustacean, can you explain why we cannot rewrite the above code like this, even if we import all of the symbols?
Probably for lack of
> weird operators like <$>, <*>, $, or >>=
Examples:
https://kotlinlang.org/docs/extensions.html
https://docs.scala-lang.org/scala3/reference/contextual/exte...
See also: https://en.wikipedia.org/wiki/Uniform_function_call_syntax
I wrote a little pipeline macro in https://nim-lang.org/ for Advent of Code years ago and as far as I know it worked okay.
```
import macros
```
Makes me want to go write more nim.
And you'd lose all those cases of extension methods where the convenience of accepting null left of the dot is their sole reason to be. Null is a valid state, not something incredibly scary best dealt with with a full reboot or better yet throwing away the container. Kotlin is about making peace with null, instead of pretending that null does not exist. (yes, I'm looking at you, Scala)
What I do agree with is that extension methods should be a last-ditch solution. I'd actually like to see a way to do the nullable-receiver thing defined more like regular functions. Perhaps something like a nullable-receiver function that is defined inside the regular class block, imported like a regular method (as part of the class), and even present in the class object, e.g. for reflection on the non-null case (and for Java compat where that still matters).
> Null is a valid state, not something incredibly scary best dealt with with a full reboot or better yet throwing away the container. Kotlin is about making peace with null, instead of pretending that null does not exist. (yes, I'm looking at you, Scala)
I honestly find this to be such a weird thing to say or imply. No one is "scared" of null.
Agreed. It should be a first-class construct in a language, with its own proper type, rather than needing to hitch a ride with the Integers and Strings like a second-class construct.
I prefer to just generalize the function (make it generic, leverage traits/typeclasses) tbh.
> Probably for lack of
> weird operators like <$>, <*>, $, or >>=
Nope btw. I mean, maybe? I don't know Haskell well enough to say. The answer that I was looking for here is a specific Rust idiosyncrasy. It doesn't allow you to import `std::iter::Iterator::collect` on its own. It's an associated function, and needs to be qualified. (So you need to write `Iterator::collect` at the very least.)
You probably noticed, but it should become a thing in RFC 3591: https://github.com/rust-lang/rust/issues/134691
So it does kind of work on current nightly:
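Roughly like this, presumably (a sketch; the feature-gate name below is taken from the tracking issue, so treat it as an assumption):

#![feature(import_trait_associated_functions)]

use std::iter::Iterator::collect; // rejected on stable: associated functions can't be imported directly

fn evens(data: Vec<u32>) -> Vec<u32> {
    // with the import in scope, `collect` can be called as a free function
    collect(data.into_iter().filter(|n| n % 2 == 0))
}

fn main() {
    println!("{:?}", evens(vec![1, 2, 3, 4]));
}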
Of course this really only matters when you're 25 minutes into critical downtime and a bug is hiding somewhere in these method chains. Anything that is surprising needs to go.
IMHO it would be better to set intermediate variables with dead simple names instead of relying on newlines.
fn get_ids(data: Vec<Widget>) -> Vec<Id> {
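    // the body here is a sketch of the intermediate-variable style being described,
    // reusing the article's assumed Widget { alive, id } shape; the names are just illustrative
    let x = data.iter();
    let y = x.filter(|w| w.alive);
    let z = y.map(|w| w.id);
    z.collect()
}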
Yeah, I agree. The problem is that you have to keep track of nesting in the middle of the expression and then unnest it at the end, which is taxing.
So, I also think it could also read better written like this, with the arguments reversed, so you don't have to read it both ways:
That's also what they do in Haskell. The first argument to map is the mapping function, the first argument to filter is the predicate function, and so on. People will often write their function definitions point-free, with the argument omitted, because using the function composition operator looks neater than using a bunch of dollar signs or parentheses.
Making it the second argument only makes sense when functions are written after their first argument, not before, to facilitate writing "foo.map(f).filter(y)".
Along with the `|>` operator (which is itself just a function that's conventionally infixed), this turns out to be really nice for flexibility/reusability: several superficially different ways of writing the same program all end up doing the same thing.
It was extremely satisfying to discover that with this encoding, `|>` is simply an identity function!
[0]: https://github.com/mkantor/please-lang-prototype
[1]: In reality variable dereferencing uses a sigil, but I'm omitting it from this comment to keep the examples focused.
See how adding line breaks still keeps the `|w| w.alive` very far from the `filter` call? And the `|w| w.id` very far from the `map` call?
If you don't have the pipeline operator, please at least format it something like this:
...which is still absolutely atrocious both to write and to read!
Also see how this still reads fine despite being one line:
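(Presumably the one-liner meant here is the article's chain written without any breaks, e.g.:)

data.iter().filter(|w| w.alive).map(|w| w.id).collect()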
It's not about line breaks, it's about the order of applying the operations, and about the parameters to the operations you're performing.
For me, it's both. Honestly, I find it much less readable the way you've split it up. The way I had it makes it very easy for me to read it in reverse: map, filter, map, collect.
> Also see how this still reads fine despite being one line
It doesn't read fine, to me. I have to spend mental effort figuring out what the various "steps" are. Effort that I don't need to spend when they're split across lines.
For me, it's a "forest for the trees" kind of thing. I like being able to look at the code casually and see what it's doing at a high level. Then, if I want to see the details, I can look more closely at the code.
> You might think that this issue is just about trying to cram everything onto a single line, but frankly, trying to move away from that doesn’t help much. It will still mess up your git diffs and the blame layer.
Diff will still be terrible because adding a step will change the indentation of everything 'before it' (which, somewhat confusingly, is below it syntactically) in the chain.
x = iter(data);
y = filter(x, w=>w.isAlive);
z = map(y, w=>w.id);
return collect(z);
It doesn't need new syntax, but to implement this with the existing syntax you do have to figure out what the intermediate objects are. Then again, you have that problem with "pipelining" too, unless it compiles the whole chain into a single thing à la LINQ.
https://datapad.readthedocs.io/en/latest/quickstart.html#ove...
I think the imperative style isn't as readable (of course I would), but that's absolutely a discussion for another day, and I get why people prefer it.
The APL family is similarly consistent, except RTL.
https://dspace.mit.edu/handle/1721.1/6035
https://dspace.mit.edu/handle/1721.1/6031
https://dapperdrake.neocities.org/faster-loops-javascript.ht...
The tone of this (and the entire Haskell section of the article, tbh) is rather strange. Operators aren't special syntax and they aren't "added" to the language. Operators are just functions that by default use infix position. (In fact, any function can be called in infix position. And operators can be called in prefix position.)
The commit in question added & to the prelude. But if you wanted & (or any other character) to represent pipelining you have always been able to define that yourself.
Some people find this horrifying, which is a perfectly valid opinion (though in practice, when working in Haskell it isn't much of a big deal if you aren't foolish with it). But at least get the facts correct.
This is my biggest complaint about Python.
Um, you can:
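(Presumably with the adapters written as fully qualified calls; a self-contained sketch, assuming a Widget/Id shape like the article's with Id being Copy:)

#[derive(Clone, Copy, Debug)]
struct Id(u32);

struct Widget { id: Id, alive: bool }

fn get_ids(data: Vec<Widget>) -> Vec<Id> {
    Iterator::collect(Iterator::map(
        Iterator::filter(data.iter(), |w| w.alive),
        |w| w.id,
    ))
}

fn main() {
    let widgets = vec![
        Widget { id: Id(1), alive: true },
        Widget { id: Id(2), alive: false },
    ];
    println!("{:?}", get_ids(widgets));
}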
and you can, because it's lazy, which is also the same reason you can write it the other way in Rust. I think the author was getting at an ownership trap, but that trap is avoided the same way for both arrangements; the call order is the same in both. If the calls were actually a pipeline (if collect didn't exist and didn't need to be called), then other considerations would show up.
BTW, for people complaining about the debug-ability of it: https://doc.rust-lang.org/std/iter/trait.Iterator.html#metho... etc.
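(That link presumably points at adapters like `Iterator::inspect`, which let you peek at values mid-chain without restructuring it; a tiny sketch:)

fn main() {
    let ids: Vec<u32> = vec![1, 2, 3, 4]
        .into_iter()
        .inspect(|n| eprintln!("before filter: {n}"))
        .filter(|n| n % 2 == 0)
        .inspect(|n| eprintln!("kept: {n}"))
        .collect();
    println!("{ids:?}");
}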
Also, does the name matter if it works the same and has the same properties?
Maybe the author called it "pipelines" to avoid functional purists from nitpicking it.
In the context of a specific programming language feature it seems like terminology would be important; I wasn't trying to nitpick unintentionally.
These "pipelines" and "object streaming" APIs are often built upon OOP. I feel that calling it "transducers" would offend the sensibilities of those who think it must be functional all the way down.
Don't you think it's better to keep it with a different name? I mean, even among the functional community itself there seems to be a lot of stress around purity, why would anyone want to make it worse?
This is far different than the pattern described in the article, though. Small shame they have come to have the same name. I can see how both work with the metaphor; such that I can't really complain. The "pass a single parameter" along is far less attractive to me, though.
And it's already idiomatic unlike bolting a pipeline operator onto a language that didn't start with it.
Pipelining can guide one to write a bit cleaner code, viewing steps of computation as such, and not as modifications of global state. It forces one to make each step return a result and to write proper functions. I like proper pipelining a lot.
i mean this sounds fun
but tbh it also sounds like it'd result in my colleague Carl defining an utterly bespoke DSL in the language, and using it to write the worst spaghetti code the world has ever seen, leaving the code base an unreadable mess full of sharp edges and implicit behavior
Instead of writing: a().b().c().d(), it's much nicer to write: d(c(b(a()))), or perhaps (d ∘ c ∘ b ∘ a)().
1. do the first step in the process
2. then do the next thing
3. followed by a third action
I struggle to think of any context outside of programming, retrosynthesis in chemistry, and some aspects of reverse-Polish notation calculators, where you conceive of the operations/arguments last-to-first. All of which are things typically encountered pretty late in one's educational journey.
But the moment you write a(b()), you're already breaking your left-to-right/first-to-last rule.
The aliases aren't really necessary, you could just write the last line as 'select count(count(*)) ncust, count(*) nord' if you aren't afraid of nested aggregations, and if you are you'll never understand window functions, soo...
The |> syntax adds visual noise without expressive power, and the novelty 'aggregate'/'call' operators are weird special-case syntax for something that isn't that complex in the first place.
The implicit projection is unnecessary too, for the same reason any decent SQL linter will flag an ambiguous 'select *'
It does seem like the former could be solved by just loosening up the grammar to allow you to specify the clauses in any order; e.g. writing the FROM clause before the SELECT list seems perfectly unambiguous.
The |> operator is really cool.
Being able to inspect the results of each step right at the point you’ve written it is pretty convenient. It’s readable. And the compiler will optimize it out.
b) Async works on Scala Native: https://github.com/lampepfl/gears and is coming to Scala.js.
I have no idea what this is trying to say, or what it has to do with the rest of the article.
For comparison, UNIX pipes support only trivial byte streams from output to input.
PowerShell allows typed object streams where the properties of the object are automatically wired up to named parameters of the commands on the pipeline.
Outputs at any stage can not only be wired directly to the next stage but also captured into named variables for use later in the pipeline.
Every command in the pipeline also gets begin/end/cancel handlers automatically invoked so you can set up accumulators, authentication, or whatever.
UNIX scripting advocates don’t know what they’re missing out on…
I use powershell daily but am hopeful that I can replace it with Nushell at some point.
[1]: https://docs.hhvm.com/hack/expressions-and-operators/pipe
I remember being able to deal with object streams with it quite comfortably.
Java has a culture of having a layer above fundamentals.
We're past all that already. I am discussing the ergonomics of their null checking APIs, particularly in the context of pipelining (or streaming, in the Java world).
I find them quite comfortable.
You have a create_user function that doesn't error? Has no branches based on type of error?
We're having arguments over the best way to break these over multiple lines?
Like.. why not just store intermediate results in variables? Where our branch logic can just be written inline? And then the flow of data can be very simply determined by reading top to bottom?
A
.B
.C
... looking at you R and tidyverse hell.
One difference is that currying returns an incomplete result (another function) which must be called again at a later time. On the other hand, pipelining usually returns raw values. Currying returns functions until the last step. The main philosophical failure of currying is that it treats logic/functions as if they were state which should be passed around. This is bad. Components should be responsible for their own state and should just talk to each other to pass plain information. State moves, logic doesn't move. A module shouldn't have awareness of what tools/logic other modules need to do their jobs. This completely breaks the separation of concerns principle.
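(For concreteness, a small Rust sketch of that distinction; the names and numbers are made up for the example:)

// Curried style: `add(1)` returns another function that still has to be called later.
fn add(a: i32) -> impl Fn(i32) -> i32 {
    move |b| a + b
}

fn main() {
    let add_one = add(1);        // an incomplete result: a function
    println!("{}", add_one(41)); // 42

    // Pipelined style: every step yields a plain value, not a function.
    let total: i32 = (1..=3).map(|x| x * 10).filter(|x| *x > 10).sum();
    println!("{total}");         // 50
}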
When you call a plumber to fix your drain, do you need to provide them with a toolbox? Do you even need to know what's inside their toolbox? The plumber knows what tools they need. You just show them what the problem is. Passing functions to another module is like giving a plumber a toolbox which you put together by guessing what tools they might need. You're not a plumber, why should you decide what tools the plumber needs?
Currying encourages spaghetti code which is difficult to follow when functions are passed between different modules to complete the currying. In practice, designing code that gathers all the info it needs and then calls the function once leads to much cleaner and much more readable code.