No need to "reserve" a byte in safe_readlink, so use size instead of (size-1). It's a tiny change that makes no real difference, but it does allow the initial size to be as small as 1 and still work, instead of 2 as before. Of course, I use 256 anyway so that point is moot, but this code is cleaner. I also added readlink to test/a3.fxl.
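The growth loop can be sketched like this (a toy model only -- the real safe_readlink returns a Fexl string, and the names here are mine). The point is that we pass the full buffer size to readlink, and only a result strictly smaller than the buffer is trusted, which automatically leaves a spare byte for the terminator:

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch of the growth loop: read into a buffer of the
   given size, and if readlink fills the ENTIRE buffer, the result may
   be truncated, so double the size and retry.  Because we pass size
   itself (not size-1), even an initial size of 1 still works. */
static char *sketch_readlink(const char *path, size_t size)
{
	while (1)
	{
		char *buf = malloc(size);
		ssize_t n = readlink(path, buf, size);
		if (n < 0)
		{
			free(buf);
			return 0;  /* not a symlink, or other error */
		}
		if ((size_t)n < size)
		{
			buf[n] = '\0';  /* fits, with a spare byte for the NUL */
			return buf;
		}
		free(buf);
		size *= 2;  /* buffer entirely filled: may be truncated */
	}
}
```

With the old (size-1) convention an initial size of 1 would leave zero bytes for readlink; with this convention it merely forces one extra doubling.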
Getting the base path dynamically so I don't need to build it into the program. Formerly I had logic in the build script to create stub .c files (in the obj directory) containing the base path of the program. Now I get it dynamically using /proc/self/exe, which is equivalent to $0 in a /bin/sh script. See get_base_path in file.c. This is good preparation for possibly removing the direct linkage with the libfexl.so library, instead treating it like any other independent library such as openssl.

I moved reduce_base_path from meta.c to file.c so it's closer to other things related to readlink. Note also how in fexl.c I'm using Fexl strings when building the full path instead of calling buf operations directly. Using Fexl strings avoids much of the tedium of string manipulation, and makes things more clearly correct. For another example of this, look at how much simpler the safe_readlink routine in file.c is now that it returns a Fexl string instead of setting complicated pointers passed in as arguments. This also made reduce2_readlink far simpler.
You can now call readlink(2) from Fexl, for example:

	readlink "/proc/self/exe" \path
	print "You are currently running "; print path; nl;

I also changed the build script slightly so that its dependency analysis (via grep) does not insist that #include lines always end precisely in a double-quote. For example, you can now add spaces or comments on the end of #include lines, and the grep ignores them. I had just made it too strict before, by including '$' at the end of the pattern to match end-of-line immediately after the quote.
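The relaxation amounts to dropping the end-of-line anchor. Roughly (the build script's actual pattern may differ in detail):

```shell
line='#include "value.h"  /* note */'

# Old, too-strict idea: '$' right after the closing quote, so the
# trailing comment makes the line fail to match.
echo "$line" | grep -q '^#include ".*"$' || echo "old pattern: no match"

# Relaxed idea: no end-of-line anchor, trailing text is ignored.
echo "$line" | grep -q '^#include ".*"' && echo "new pattern: match"
```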
Make sure locally created files are not owned by root after doing build as sudo. So, if you're running build under sudo, we change the owner and group of all locally created files to the underlying non-privileged user $SUDO_USER. That way you don't have any files annoyingly owned by root in your local directory.

Here's a clear-cut test case which now works:

	sudo ./build quiet clean install
	./build quiet clean install in ~/myfexl

Before the change, you'd get a bunch of errors on the second command as it tried to delete object files owned by root created in the first command. With the change, everything works smoothly. Here's another clear-cut test case:

	sudo ./build quiet clean
	./build erase
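The ownership fix amounts to something like this (a sketch under assumed names, not the literal build script):

```shell
# Hypothetical sketch: when running under sudo, hand locally created
# files back to the invoking user so nothing ends up owned by root.
fix_owner()
{
	# sudo sets $SUDO_USER to the name of the invoking user.
	if [ -n "$SUDO_USER" ] && [ "$(id -u)" = 0 ]; then
		chown -R "$SUDO_USER:$(id -gn "$SUDO_USER")" "$@"
	fi
}
```

Something like `fix_owner obj bin` after the compile steps; when the script is not running under sudo, the function is a harmless no-op.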
The build script now has an option to install in a directory other than /usr. I also removed the explicit invocation of "sudo" there, because we don't want to require system privileges if you're just installing to a place of your own such as "~/myfexl". The /usr directory is still the default, but now if you're installing there you have to say "sudo ./build install" explicitly.
Major improvement in the propagation of type_open in parse.c. Recall that type_open was designed to indicate that a form contains symbols, i.e. it is not a closed form. That allows the lambda abstraction routine to avoid descending into closed forms, needlessly looking for symbols which of course are not there.

To propagate type_open, I would call an "apply" routine to apply two forms together. The correct rule is that if *either* of the two forms is open, then the result of applying them together is also open. Unfortunately, I've had a long-standing bug in the apply routine, which said that if *both* of the forms were open, the result was also open. This would fail to mark many forms as open. Consequently we could have forms that contain symbols which were *not* marked as open. Ordinarily that would manifest as a fatal error, and nothing would work at all.

However, when I first unconsciously introduced this bug, who knows how long ago, I also put in a "work around" in the get_pattern routine, which said in effect: go ahead and punch into *all* applicative forms, *regardless* of whether they're marked as open or not. That certainly worked around what would ordinarily be a fatal error, but it also pretty much negated all the efficiency gains of using type_open in the first place!

In this new version I've resolved the issue entirely, and I now propagate type_open correctly. The result is that the a3.fxl test, which deliberately mentions every symbol defined in the standard environment, now runs in 0.04s instead of 0.11s -- i.e. it's nearly 3 times faster. That's not a lot of absolute clock time, but there's a larger issue here, apart from just having clean and correct code. I discovered this error while I was in the process of removing the "symbol stacks" from parse.c, i.e. inner_syms and outer_syms.
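In toy form, the corrected rule is a one-line change (illustrative types, not the actual parse.c):

```c
#include <stdlib.h>

/* Toy model of form nodes: 'open' marks a form that contains symbols. */
struct form
{
	int open;
	struct form *L, *R;
};

/* Applying two forms: the result is open if EITHER side is open.
   The long-standing bug used && here, so applying an open form to a
   closed one (or vice versa) was wrongly marked closed. */
static struct form *apply_forms(struct form *x, struct form *y)
{
	struct form *f = malloc(sizeof *f);
	f->L = x;
	f->R = y;
	f->open = x->open || y->open;
	return f;
}
```

With &&, any application mixing an open side with a closed side loses its open mark, which is exactly how forms containing symbols ended up unmarked.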
The idea is that at the end of the parse, instead of relying on outer_syms to tell me what symbols the form contains, I can just look in the form itself: grab the first symbol in left-to-right order, abstract that symbol out, apply the resulting form to the definition of the symbol, and repeat until there are no more symbols. That's actually screaming fast, especially when type_open is propagated correctly. Well, my initial attempts were not working at all because I was relying on the accuracy of type_open. I quickly discovered the problem, which led to the revision you see here. Now I can once again resume work on removing the symbol stacks.

Moreover, I have a much larger goal in mind. Instead of using lambda pattern substitution, I plan to go back to using combinators. I was using combinators long ago, but I found that lambda substitution was much faster. But now I'm pretty sure that's only because I was using combinators which behaved in an ordinary way, meaning that they had to churn their way through the evaluation process one step at a time. For example, (S x y z) became ((x z) (y z)), and then x would absorb its argument z as another combinatorial reduction, and the churning would continue from there. It all worked of course, but it was slow in comparison with the pattern substitution I currently use.

Now I realize that I can do combinators which behave far more aggressively, if I switch to using reduction routines which look like this in the C code:

	value z = x->T(x,y);

That's how we'll do aggressive function application. We'll use A(x,y) for lazy function application, which is still necessary in cases such as the right side of the S combinator rule. The S combinator should look roughly like this:

	value left = x->T(x,z);
	return left->T(left, A(y,z));

Of course, I could define a function app(x,y) as shorthand for x->T(x,y), and the C compiler is very good at inlining such definitions.
In that case the S combinator could look like this:

	return app(app(x,z), A(y,z));

Note the aggressive form on the left side, and the lazy form on the right side. This commit message has reached epic proportions, but this is basically how I "blog" these days. I'm getting back to work now. Talk to you soon!
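The x->T(x,y) calling convention can be modeled with a toy value type (assumptions throughout -- Fexl's real value struct and routines differ):

```c
#include <stdlib.h>

/* Toy model: a value carries its own reduction routine T.  The
   aggressive form app(x,y) invokes it immediately; a lazy A(x,y)
   would instead build an unevaluated application node. */
typedef struct value_s *value;
struct value_s
{
	value (*T)(value, value);
	long n;  /* payload for this toy: a number */
};

static value num(long n);

/* Reduction routine for a toy numeric value: applying it to another
   number yields their sum, computed on the spot with no churning
   through intermediate application nodes. */
static value reduce_add(value x, value y)
{
	return num(x->n + y->n);
}

static value num(long n)
{
	value v = malloc(sizeof *v);
	v->T = reduce_add;
	v->n = n;
	return v;
}

/* Aggressive application: one direct call to the reduction routine. */
static value app(value x, value y)
{
	return x->T(x, y);
}
```

Each app call jumps straight into the value's own reduction routine, which is the "aggressive" behavior, as opposed to building an (x y) node and reducing it later.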
To avoid the despised obscure issues I mentioned in the last commit, I refactored the parse code so everything looks a lot more natural and compact, in particular parse_lambda. There aren't as many arbitrary-looking hold and drop operations, because now the "apply" and "lambda" routines are smart about doing the hold and drop on their arguments. Also, because they handle NULL cases intelligently, the various parse results can be chained together more simply without so many conditional checks, even when there's a syntax error. You'll also notice that the "push" and "pop" calls are more isolated. In an upcoming version we won't need any of that, because we'll be computing the lambda patterns from the bottom up instead of top-down as we do now in get_pattern.

Furthermore, the lambda routine now calls a new "copy_exp" routine to make a shallow copy of f->L, instead of wrapping it inside A(I,f->L). Note that we have to do *something* special here, because the result f->L is a subcomponent of the parent structure f, and we're about to *drop* f because it's no longer used. That would be a dangerous operation unless we handle it consciously and correctly.

The nice thing about doing the copy instead of applying I to it is that it greatly compresses some of the parsed forms, as you can see by looking at the diffs in a1.out. Unfortunately a1.out looks like a binary file on github because it has some control characters, so it's hard to see the diffs there. But if you look into it, you can see some serious reductions. For example, this form:

	(((((lam (C (C (I C)))) ((lam (C (C I))) (I (long_add 4)))) long_add) 4) 5)

is now simply

	(((I long_add) 4) 5)

Also, this form:

	((lam (C (C (I C)))) ((lam (C (C I))) (I (<= ==))))

is now simply I.

You may also notice that in value.c I've eliminated the "create" routine. Formerly the A, D, and Q routines were defined in terms of create.
Strangely, when I added the new copy routine and defined it in terms of create, I found that the a2 test slowed down, taking 19.5 seconds to run instead of 16.8! Even more strangely, it was the mere *presence* of the new copy routine which caused the slowdown -- I didn't even have to call it or use it in any way. Evidently, because copy was calling create, it pushed something over the edge in the C compiler so that create was no longer being inlined, and we suddenly saw extra function call overhead. To work around that issue, I flipped things around so that "A" is now the primary routine, and D, Q, and copy are defined in terms of A.
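The hazard of dropping f while still needing f->L can be seen in a toy reference-counted model (illustrative only; value.c's hold, drop, and copy differ in detail):

```c
#include <stdlib.h>

/* Toy refcounted pairs: drop() frees a node when its count reaches
   zero, releasing its children too.  A shallow copy takes fresh
   references to the children, so the copy stays valid even after the
   parent containing the original sub-form is dropped. */
struct node
{
	int refs;
	struct node *L, *R;
};

static void hold(struct node *p) { if (p) p->refs++; }

static void drop(struct node *p)
{
	if (p && --p->refs == 0)
	{
		drop(p->L);
		drop(p->R);
		free(p);
	}
}

/* Build a pair, taking a reference to each child. */
static struct node *pair(struct node *x, struct node *y)
{
	struct node *p = malloc(sizeof *p);
	p->refs = 1;
	hold(x);
	hold(y);
	p->L = x;
	p->R = y;
	return p;
}

/* Shallow copy: same children, independently held. */
static struct node *copy_node(struct node *p)
{
	return pair(p->L, p->R);
}
```

Taking the shallow copy before dropping the parent is what makes the drop safe, and unlike wrapping with A(I, ...), it adds no extra application node to the form.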
Fixed problem in parse.c line 442 where I drop def, but def happens to be equal to val because we hit the special I pattern in the call to apply above it. In that case we *need* val and don't want to discard it. Notice that this fixes two test cases in a1.fxl which I stupidly didn't check carefully when I did the previous commit that removed implicit eager semantics from definitions. I despise obscure issues like this, preferring self-evident clarity in the code everywhere.

The good news is, I'm currently working on a separate branch where all my parse functions return a value of the form (pair ok; pair exp; symbols). The purpose of symbols is to compute the lambda substitution patterns from the bottom up, instead of reaching down from the top in get_pattern. A side effect of this technique is that we should no longer have obscure cases due to optimizations, since we'll always be holding references inside the new higher-level values returned from the parse functions, instead of, as in the case of this recent bug, inadvertently dropping a reference to something we still need.