Pipes

Pipes is an experimental toy programming language about applying successive transformations to some data. Think Bash with static typing. Hello world is `"Hello World!" |print ;`. See the syntax below, or see the examples in ./pipes_programs/demos/.

This repo contains an interpreter written in Rust and a couple of related tools, like a web+wasm playground.

Roadmap

[x] means mostly done, [/] means partial support, [ ] means not done.

Basic language features:

Project organization

be able to split code in different files
...and be able to import identifiers
...with minimal project definition
...in a way that is compatible with LSP

Typechecking:

semantic types (:t meaning "the previous expression is of type t")
[/] type inference (like C++ auto)

Metaprogramming:

type parameters (aka generics or templates)
[/] eval (execute strings that contain code)
hygienic macros

Tooling

web playground (https://jmmut.itch.io/pipes)
check and run continuously (see pipes-check or the web playground)
[/] print code location in error messages
[/] debugger breakpoints
Language Server Protocol

Syntax

The general syntax is (operator parameter+)*. There's no operator precedence. Everything is left-associative, and you can use braces {} to group expressions.
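To illustrate what "no precedence, left-associative" means in practice, here is a minimal Rust sketch. It is not taken from the interpreter: the `Operator` enum and the `evaluate` function are hypothetical names, and only three integer operators are modelled.

```rust
// Hypothetical sketch: names and the operator set are illustrative only,
// not taken from the Pipes interpreter.
#[derive(Clone, Copy, Debug)]
enum Operator {
    Add,
    Subtract,
    Modulo,
}

/// Applies each (operator, operand) pair to the accumulated value,
/// strictly left to right, with no precedence between operators.
fn evaluate(initial: i64, operations: &[(Operator, i64)]) -> i64 {
    operations
        .iter()
        .fold(initial, |accumulated, &(operator, operand)| match operator {
            Operator::Add => accumulated + operand,
            Operator::Subtract => accumulated - operand,
            Operator::Modulo => accumulated % operand,
        })
}

fn main() {
    // 2 +4 %3 is read as (2 + 4) % 3, not as 2 + (4 % 3).
    let result = evaluate(2, &[(Operator::Add, 4), (Operator::Modulo, 3)]);
    println!("{result}"); // prints 0

    // 5 +3 -2 is read as (5 + 3) - 2.
    let result = evaluate(5, &[(Operator::Add, 3), (Operator::Subtract, 2)]);
    println!("{result}"); // prints 6
}
```

In a language with the usual precedence table, the first expression would give 3. Here each operation simply transforms whatever value came before it.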

You can use these atoms:

64-bit signed integer numbers (e.g. 4, 'a' (syntax sugar for ASCII code 97))
- unary negation (e.g. -5)
Functions (e.g. function(x) {x} (which is the anonymous identity function))
- Functions can declare return types too: function(x)(:i64) {x}
Arrays (e.g. [1 2 3], allocated in the heap)
Identifiers (e.g. my_namespaced/variable2)
Strings (e.g. "hello world" (syntax sugar for an array of utf-8 ints))
Nothing (e.g. {} or {;}). This is like Rust's Option::None.

With these operations:

Nice to haves

be able to print call stacks in pipes code
const

Simplified Grammar

Worth a special mention are Chains and Types.

A Chain is delimited by braces ({, }) and contains a list of operations applied to a value. Examples are {5}, {;5 +3 -2} and {5 |print}.

Types are delimited by parentheses and contain a list of named types (name, :, type). Either the name or the type can be omitted. Examples with one child are (x :i64), (x), (:i64); with 2 children (x y); with 3 children (second name omitted, third type omitted) (a :i64 :i64 c).
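The grouping rule for named types can be sketched in Rust as follows. This is a simplified illustration, not the interpreter's parser: `TypedIdentifier` and `parse_flat_types` are hypothetical names, the input is the text between the parentheses, nested types are not handled, and it assumes the colon is written attached to the type as in the examples above.

```rust
// Hypothetical sketch: handles only flat Types such as `a :i64 :i64 c`,
// with no nesting. Not the interpreter's real parser.
#[derive(Debug, PartialEq)]
struct TypedIdentifier {
    name: Option<String>,
    type_name: Option<String>,
}

/// Groups whitespace-separated tokens into named types.
/// A token starting with ':' is a type; anything else is a name.
fn parse_flat_types(source: &str) -> Vec<TypedIdentifier> {
    let mut children: Vec<TypedIdentifier> = Vec::new();
    let mut previous_was_name = false;
    for token in source.split_whitespace() {
        if let Some(type_name) = token.strip_prefix(':') {
            if previous_was_name {
                // The type belongs to the name right before it.
                if let Some(last) = children.last_mut() {
                    last.type_name = Some(type_name.to_string());
                }
            } else {
                // A type without a name is a child of its own.
                children.push(TypedIdentifier {
                    name: None,
                    type_name: Some(type_name.to_string()),
                });
            }
            previous_was_name = false;
        } else {
            // A name always starts a new child; its type may follow.
            children.push(TypedIdentifier {
                name: Some(token.to_string()),
                type_name: None,
            });
            previous_was_name = true;
        }
    }
    children
}

fn main() {
    // Three children: `a :i64`, `:i64` (no name) and `c` (no type).
    for child in parse_flat_types("a :i64 :i64 c") {
        println!("{child:?}");
    }
}
```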

Chains and Types are composable in a few contexts. For example, a function is the literal function, a Types and a Chain, like function(x) {x}. A branch is branch and two Chains. A nested type is a name and a Types, like tuple(x :i64 y :i64).

A simplified grammar of most of the language:

Array = '[' Expression* ']'
Chain = '{' Operation* '}'
Types = '(' TypedIdentifier* ')'

Expression = Number | String | Array | Chain | Function | Branch | Type
Operation = Operator Expression+
Operator = '|' | ';' | '+' | '==' | ...
TypedIdentifier = Identifier ':' Type | Identifier | ':' Type
Type = Identifier Types?

Scope = Types Chain

Branch = 'branch' Chain Chain
Function = 'function' Scope
Map = 'map' Scope
BrowseOr = 'browse_or' Scope Chain

In a Scope, the first identifier of the Types is available as first value for the Chain, so you can do function(x) {+1} and it will increment its argument, or [4 5 6] |map(x) {+10} and create a new list [14 15 16].
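A rough Rust model of this behaviour, under heavy simplification: a Chain is reduced to a list of "add N" steps, and `run_chain` and `map_scope` are hypothetical names, not functions from the interpreter.

```rust
// Hypothetical sketch: a Chain is modelled as a list of "add N" steps only.

/// Runs a chain whose first value is the scope's argument.
fn run_chain(argument: i64, additions: &[i64]) -> i64 {
    additions
        .iter()
        .fold(argument, |value, addition| value + addition)
}

/// Models `map(x) {chain}`: runs the chain once per element
/// and collects the results into a new list.
fn map_scope(elements: &[i64], additions: &[i64]) -> Vec<i64> {
    elements
        .iter()
        .map(|&element| run_chain(element, additions))
        .collect()
}

fn main() {
    // function(x) {+1} applied to 5
    println!("{}", run_chain(5, &[1])); // prints 6

    // [4 5 6] |map(x) {+10}
    println!("{:?}", map_scope(&[4, 5, 6], &[10])); // prints [14, 15, 16]
}
```

The point of the sketch is that the chain never has to name its argument: the argument is already the value the first operation acts on.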

IMHO it's quite elegant how Arrays [] are for data, Chains {} are for code executed at runtime, and Types () are for typechecking done at compile time (or at least done before runtime, as technically there's no compile time in an interpreter).

Architecture

The interpreter goes through these stages, starting from an entry-point source code file:

lex the source code file
parse it
- for each undefined identifier:
  - infer the source code file where it's defined, and lex and parse that file (recursive)
infer and check types of the entry-point file and all the imported identifiers
evaluate comptime expressions
runtime execution
- if eval is used, call all previous stages (including runtime) in a nested environment
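The recursive "lex and parse whatever is still undefined" step can be sketched like this in Rust. Everything here is an assumption made for illustration: files live in memory, a parsed file is reduced to the identifiers it uses, and the rule that `namespace/name` is defined in the file `namespace` is invented for the example. The real interpreter may infer files differently.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical sketch of the "parse, then load what is undefined" loop.
// Each in-memory "file" maps to the identifiers it uses.
type Files = BTreeMap<&'static str, Vec<&'static str>>;

/// Assumed rule (not from the interpreter): the part before the '/'
/// names the file that defines the identifier.
fn infer_file(identifier: &str) -> Option<&str> {
    identifier.split_once('/').map(|(namespace, _name)| namespace)
}

/// Loads `file` and, recursively, every file its identifiers point to.
fn load(file: &str, files: &Files, loaded: &mut BTreeSet<String>) {
    let Some(used_identifiers) = files.get(file) else {
        return; // a real interpreter would report a missing file here
    };
    if !loaded.insert(file.to_string()) {
        return; // already lexed and parsed, which also stops import cycles
    }
    for identifier in used_identifiers {
        if let Some(defining_file) = infer_file(identifier) {
            load(defining_file, files, loaded);
        }
    }
}

fn main() {
    let mut files = Files::new();
    files.insert("main", vec!["math/double", "text/shout"]);
    files.insert("math", vec!["text/shout"]);
    files.insert("text", vec!["math/double"]); // cycle on purpose

    let mut loaded = BTreeSet::new();
    load("main", &files, &mut loaded);
    println!("{loaded:?}"); // prints {"main", "math", "text"}
}
```

Only after this loop finishes would the later stages (typechecking, comptime evaluation, runtime) see the entry-point file together with everything it imports.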

Philosophy of the language

Most programs apply transformations to some value

In my experience, most programming is about having some input, applying some transformations, and returning some output.

Sure, some algorithms need to keep track of many variables, but ideally you want to make your code modular so that each piece is a simplified algorithm that is about modifying a datum, or a list of elements. This also makes tests wonderfully simple.

I think this transformation-oriented programming is the reason why Bash (well, shells in general) is so handy for simple tasks. However, shell scripts quickly turn into a mess when you need to do anything more complex than an if and a loop, which is a shame. I think that happens because their main datatype is free text, which is why...

Pipes has static inferred types

Type checking will happen in every language; you can only decide whether it will happen at compile time or at run time. And I think the earlier you can catch mistakes, the better.

Some people complain that static typing is worse at development speed and experimentation, but I think that's not necessarily true. In this language you don't need to specify types most of the time, but they will still be checked before having to run the program.

At the moment this project only has an interpreter backend, but I could add a compiler backend (as I did in a previous version of this project, https://bitbucket.org/jmmut/pipes/). The program [1 2 3] |function(x) {x+1} =a would fail typechecking before runtime, despite not mentioning any type, neither for the function definition nor for the variable a. Typechecking fails because addition cannot be done on arrays.
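As a rough picture of why that program is rejected before it runs, here is a minimal Rust sketch with two types and a single rule. `Type` and `check_addition` are hypothetical names, and the real typechecker is certainly more involved.

```rust
// Hypothetical sketch: two types and one rule, far simpler than the
// real typechecker.
#[derive(Clone, Debug, PartialEq)]
enum Type {
    I64,
    Array(Box<Type>),
}

/// Addition is only defined between two i64 values.
fn check_addition(left: &Type, right: &Type) -> Result<Type, String> {
    match (left, right) {
        (Type::I64, Type::I64) => Ok(Type::I64),
        _ => Err(format!("cannot add {left:?} and {right:?}")),
    }
}

fn main() {
    // [1 2 3] |function(x) {x+1}: x is inferred to be an array of i64,
    // so the addition is rejected without running anything.
    let inferred_x = Type::Array(Box::new(Type::I64));
    println!("{:?}", check_addition(&inferred_x, &Type::I64));

    // 2 |function(x) {x+1} would be accepted.
    println!("{:?}", check_addition(&Type::I64, &Type::I64));
}
```

No type annotation appears in the Pipes program; the type of x comes from the value piped into the function.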

And as the project matures you can add types to make it more robust. A nice little feature is that you can also add names (for documentation purposes) to the returned types, and to the nested types:

[1 2 3]
|function(score :list(points :i64))(total_points :i64) { |sum }
==6

I hate operator precedence tables

I've had to look up this table too many times, so I made a language where that table doesn't exist: https://en.cppreference.com/w/cpp/language/operator_precedence

The problem with unary operators (or why `-5` is not supported)

Unary operators don't play well with the core idea of the language, where every operation takes the previous value plus at least one parameter. A unary operator is like an operation with 0 parameters.

To express -5 % 7 I could support 5 |- % 7 or 5 |negative % 7, but it's quite unreadable. Too close to Forth.

So currently you need to put an initial 0 so that the minus becomes a binary operation: 0 -5 %7.

Miscellaneous trivia

If you export the test coverage from RustRover into a .lcov file, you can generate an HTML report with genhtml your_file.lcov.
