tl;dr: Is it possible to define a tokenizer that does not require a callback using logos with lalrpop?
In this tutorial http://lalrpop.github.io/lalrpop/lexer_tutorial/005_external_lib.html, a token for lexing identifiers is declared:
```rust
#[derive(Logos, Clone, Debug, PartialEq)]
#[logos(skip r"[ \t\n\f]+", skip r"#.*\n?", error = LexicalError)]
enum Token {
    // ...
    #[regex("[_a-zA-Z][_0-9a-zA-Z]*", |lex| lex.slice().to_string())]
    Identifier(String),
    // ...
}
```
in addition to a parser that identifies lexed identifiers:
```rust
pub Term: Box<ast::Expression> = {
    // ...
    <name:"identifier"> => {
        Box::new(ast::Expression::Variable(name))
    },
    // ...
}
```
What I am noticing is that if no callback is passed to logos' `regex` attribute, `name` in the parser binds the token itself, as opposed to its value. In theory, though, a callback should not be required, because the lexer can return the slice that a given token matched. For example:
```rust
use logos::Logos;

#[derive(Logos, Debug, PartialEq)]
#[logos(skip r"[ \t\n\f]+")]
enum Token {
    // Note that there is no callback passed to the `regex` attribute
    #[regex("[a-zA-Z]+")]
    Text,
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn t0() {
        let mut lex = Token::lexer("sometext");
        // The token carries no payload...
        assert_eq!(lex.next(), Some(Ok(Token::Text)));
        // ...but the lexer still knows which text it matched.
        assert_eq!(lex.slice(), "sometext");
    }
}
```
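To make the idea concrete without depending on either crate, here is a minimal std-only sketch of the kind of adapter I have in mind: the token kinds carry no payload, and the iterator pairs each kind with the slice it matched, in the `(start, token, end)` shape that an external lexer hands to lalrpop. `Kind` and `Scanner` are hypothetical names, and the hand-written scanner only stands in for logos. Whether lalrpop's `extern` block can bind the slice out of such a pair is exactly what I am unsure about.

```rust
/// Payload-free token kinds, like a logos enum declared without callbacks.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Text,
}

/// Hypothetical stand-in for the logos lexer: it tracks a byte position and
/// never allocates; matched text is borrowed from the input.
struct Scanner<'input> {
    input: &'input str,
    pos: usize,
}

impl<'input> Scanner<'input> {
    fn new(input: &'input str) -> Self {
        Scanner { input, pos: 0 }
    }
}

impl<'input> Iterator for Scanner<'input> {
    // (start, (kind, matched slice), end), or an error message.
    type Item = Result<(usize, (Kind, &'input str), usize), String>;

    fn next(&mut self) -> Option<Self::Item> {
        let input = self.input;
        let bytes = input.as_bytes();

        // Skip ASCII whitespace between tokens.
        while self.pos < bytes.len() && bytes[self.pos].is_ascii_whitespace() {
            self.pos += 1;
        }
        if self.pos >= bytes.len() {
            return None;
        }

        // Consume a run of ASCII letters, i.e. [a-zA-Z]+.
        let start = self.pos;
        while self.pos < bytes.len() && bytes[self.pos].is_ascii_alphabetic() {
            self.pos += 1;
        }

        if self.pos == start {
            // Not a letter: report it and step over one whole character so
            // the iterator keeps making progress.
            let ch = input[start..].chars().next().unwrap();
            self.pos += ch.len_utf8();
            return Some(Err(format!("unexpected character {:?} at byte {}", ch, start)));
        }

        // The slice is taken from the span here, not in a per-token callback.
        Some(Ok((start, (Kind::Text, &input[start..self.pos]), self.pos)))
    }
}
```

With this shape, the `String` allocation would happen only where the parser actually needs an owned value, rather than for every identifier the lexer sees.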
That said, I haven't yet found a way to use lalrpop with logos without providing a callback. In the example from the tutorial, this fails to compile if the callback is removed:
```rust
pub Term: Box<ast::Expression> = {
    // ...
    <name:"identifier"> => {
        Box::new(ast::Expression::Variable(name.slice()))
    },
    // ...
}
```
Is it possible to define a tokenizer that does not require a callback using logos with lalrpop?