On Mon, 3 Jul 2000, Ka-Ping Yee wrote:
> It doesn't even have to be its own class of error, i suppose,
> as long as it gets indicated some way ("SyntaxError: invalid
> indentation" would be fine).

It turns out that this should be quite easy.  If it weren't
past 4am i would be posting a patch instead of just a verbal
suggestion right now -- but here's how to do it.

For this kind of thing:

>     >>> if 1:
>     ...     3
>     ...   4
>     inconsistent dedent
>       File "<stdin>", line 3
>         4
>         ^
>     SyntaxError: invalid token

...clearly it's trivial, as the case is already marked in the
code (tokenizer.c).  Instead of dumping the "inconsistent dedent"
message to stderr, return E_INDENT.

For this situation:

>     >>>  3
>       File "<stdin>", line 1
>         3
>         ^
>     SyntaxError: invalid syntax

...we have an INDENT where none is expected.  This is also easy.
At the end of PyParser_AddToken, we simply check to see if the
token that caused the problem was indent-related:

    if (type == INDENT || type == DEDENT)
        return E_INDENT;

Finally, the most interesting case:

>     >>> if 1:
>     ... 3
>       File "<stdin>", line 2
>         3
>         ^
>     SyntaxError: invalid syntax

...we expected an INDENT and didn't get one.  This is a matter
of checking the accelerator table to see what we were expecting.
Also not really that hard:

    int expected;  /* at the top of PyParser_AddToken */
    ...
    if (s->s_lower == s->s_upper - 1)  /* only one possibility */
    {
        expected = ps->p_grammar->g_ll.ll_label[s->s_lower].lb_type;
        if (expected == INDENT || expected == DEDENT)
            return E_INDENT;
    }

I like this last case best, as it means we can produce more
useful messages for a variety of syntax errors!  When there is
a single particular kind of token expected, now Python can tell
you what it is.

After inserting this:

    /* Stuck, report syntax error */
    fprintf(stderr, "Syntax error: unexpected %s",
            _PyParser_TokenNames[type]);
    if (s->s_lower == s->s_upper - 1)
    {
        fprintf(stderr, " (wanted %s)",
                _PyParser_TokenNames[labels[s->s_lower].lb_type]);
    }
    fprintf(stderr, "\n");

...i played around a bit:

    >>> (3,4]
    Syntax error: unexpected RSQB (wanted RPAR)
      File "<stdin>", line 1
        (3,4]
            ^
    SyntaxError: invalid syntax
    >>> 3..
    Syntax error: unexpected NEWLINE (wanted NAME)
      File "<stdin>", line 1
        3..
          ^
    SyntaxError: invalid syntax
    >>> 3.)
    Syntax error: unexpected RPAR
      File "<stdin>", line 1
        3.)
          ^
    SyntaxError: invalid syntax
    >>> a^^
    Syntax error: unexpected CIRCUMFLEX
      File "<stdin>", line 1
        a^^
          ^
    SyntaxError: invalid syntax
    >>> if 3:
    ... 3
    Syntax error: unexpected NUMBER (wanted INDENT)
      File "<stdin>", line 2
        3
        ^
    SyntaxError: invalid syntax
    >>> 4,,
    Syntax error: unexpected COMMA
      File "<stdin>", line 1
        4,,
          ^
    SyntaxError: invalid syntax
    >>> [3,)
    Syntax error: unexpected RPAR (wanted RSQB)
      File "<stdin>", line 1
        [3,)
           ^
    SyntaxError: invalid syntax
    >>> if a == 3 and
    Syntax error: unexpected NEWLINE
      File "<stdin>", line 1
        if a == 3 and
                    ^
    SyntaxError: invalid syntax
    >>> if a = 3:
    Syntax error: unexpected EQUAL (wanted COLON)
      File "<stdin>", line 1
        if a = 3:
             ^
    SyntaxError: invalid syntax

This isn't going to cover all cases, but i thought it was pretty
cool.  So, in summary:

  - Producing E_INDENT errors is easy, and should require just
    three small changes (one in tokenizer.c and two in parser.c,
    specifically PyParser_AddToken)

  - We can get some info we need to produce better syntax error
    messages in general, but this requires a little more thought
    about how to pass the info back out of the parser to
    pythonrun.c (err_input).


-- ?!ng

"This code is better than any code that doesn't work has any right to be."
    -- Roger Gregory, on Xanadu
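
The "only one possibility" check and the message format described above can be tried outside the interpreter. What follows is a minimal sketch and not the actual patch. The token codes, the `token_names` table, the cut-down `label` and `state` structs, and the three helper functions are all hypothetical stand-ins for the real definitions in the parser sources; only the comparison `s_lower == s_upper - 1` and the message wording are taken from the post.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical token codes; the real ones are defined by the interpreter. */
enum { TOK_NAME, TOK_NUMBER, TOK_INDENT, TOK_DEDENT,
       TOK_RPAR, TOK_RSQB, TOK_COLON, TOK_EQUAL };

/* Hypothetical stand-in for _PyParser_TokenNames, in enum order. */
const char *const token_names[] = {
    "NAME", "NUMBER", "INDENT", "DEDENT",
    "RPAR", "RSQB", "COLON", "EQUAL"
};

/* Cut-down stand-ins for the parser's label and state structures. */
typedef struct { int lb_type; } label;
typedef struct { int s_lower; int s_upper; } state;

/* Return the token type wanted by the state if exactly one label is
   acceptable (s_lower == s_upper - 1), otherwise -1. */
int sole_expected(const state *s, const label *labels)
{
    if (s->s_lower == s->s_upper - 1)
        return labels[s->s_lower].lb_type;
    return -1;
}

/* An error counts as indentation-related if the offending token or the
   single expected token is INDENT or DEDENT. */
int is_indent_error(int type, int expected)
{
    return type == TOK_INDENT || type == TOK_DEDENT ||
           expected == TOK_INDENT || expected == TOK_DEDENT;
}

/* Build the diagnostic in buf; expected < 0 means "no single candidate". */
void format_syntax_error(char *buf, size_t size, int type, int expected)
{
    if (expected >= 0)
        snprintf(buf, size, "Syntax error: unexpected %s (wanted %s)",
                 token_names[type], token_names[expected]);
    else
        snprintf(buf, size, "Syntax error: unexpected %s",
                 token_names[type]);
}
```

With a label table of `{NAME, INDENT, RPAR}` and a state of `{1, 2}`, `sole_expected` picks out INDENT, so a NUMBER token there is reported with the "(wanted INDENT)" suffix and classified as an indentation error. A state of `{0, 3}` has three candidates, so the suffix is omitted.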