The Dangers of JavaScript’s Automatic Semicolon Insertion

Although JavaScript is very powerful, the language’s fundamentals do not have a very steep learning curve.  Prior to the explosion of web applications, JavaScript was thought of as a toy language for amateur programmers.  Some of JavaScript’s features were specifically designed to cater to beginners.  One such feature is automatic semicolon insertion.  Automatic semicolon insertion is also one of JavaScript’s most controversial features.

JavaScript’s syntax borrows heavily from C and Java.  Programmers who are familiar with these languages are accustomed to semicolon-terminated statements.  JavaScript statements are also terminated by semicolons, but unlike in C and Java, these semicolons are not always required.  In an effort to “help” programmers, the JavaScript interpreter will insert omitted semicolons where it deems necessary.  Unfortunately, automatic semicolon insertion, which was intended to act as a crutch for programmers, can introduce difficult-to-find bugs.  In my opinion, it also promotes bad programming practices by not forcing developers to properly terminate statements.

Section 7.9.1 of the ECMAScript 5.1 Standard specifies the rules governing automatic semicolon insertion.  To understand the rules, you should first understand the concept of tokens.  Most of the rules involve inserting a semicolon at a line break, which the standard refers to as a LineTerminator token.  Because automatic semicolon insertion rules are related to line breaks, whitespace can affect the execution of JavaScript programs.  Many common languages (C, Java, HTML) allow developers to ignore whitespace.  Developers who are familiar with these languages can run into problems in JavaScript due to this assumption.

It is important to recognize the scenarios where automatic semicolon insertion is applied.  There are also a number of scenarios where semicolons are not automatically inserted.  The following sections describe the rules for inserting (or not inserting) semicolons automatically.

LineTerminator, Closing Braces, and End of Stream

The JavaScript interpreter will insert a semicolon between two statements when they are separated by a LineTerminator or a } token.  A semicolon will also be inserted, if needed, at the end of the input stream.  The following ‘if’ statement is, surprisingly, valid JavaScript.

if (i === 0) {
  foo = 1
  bar = 2 } baz = 3

A semicolon is inserted between the ‘foo’ and ‘bar’ assignment statements because they are separated by a LineTerminator.  Another semicolon is inserted after the ‘bar’ assignment because the next token is a closing curly brace.  A final semicolon is inserted after the ‘baz’ assignment because the end of the input stream has been reached.  After semicolon insertion, the ‘if’ statement looks like this:

if (i === 0) {
  foo = 1;
  bar = 2; } baz = 3;

return, throw, continue, and break Statements

If a LineTerminator token is encountered immediately after a ‘return’, ‘throw’, ‘continue’, or ‘break’ token, a semicolon is automatically inserted.  This means that labels in ‘continue’ and ‘break’ statements must be specified on the same line as the respective ‘continue’ or ‘break’ token.  Similarly, expressions in ‘return’ and ‘throw’ statements must begin on the same line as the ‘return’ or ‘throw’ token.  For example, the following ‘return’ statement does not exhibit the behavior that the developer likely intended.

return
a + b;

The developer most likely intended to return the result of the expression ‘a + b’.  However, when this ‘return’ statement is parsed by the interpreter, it is transformed to look like the following code.  In this case, the return value is undefined and the ‘a + b’ expression becomes unreachable code.

return;
a + b;
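
The effect can be observed directly.  The sketch below wraps both forms in functions; the names ‘sumBroken’ and ‘sumFixed’ are assumptions made for illustration and do not come from the original example.  Opening a parenthesis on the same line as ‘return’ is one common way to keep a multi-line expression intact.

```javascript
// Hypothetical helpers for illustration; not part of the original example.
function sumBroken(a, b) {
  return      // a semicolon is inserted here, so undefined is returned
  a + b;      // unreachable expression statement
}

function sumFixed(a, b) {
  return (    // the expression begins on the same line as 'return'
    a + b
  );
}

console.log(sumBroken(1, 2)); // undefined
console.log(sumFixed(1, 2));  // 3
```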

‘return’ statements that return object literals are potentially the most common victims of bugs related to semicolon insertion.  Object literal syntax lends itself well to being split across multiple lines.  This is especially true for large objects.  For example, the following function returns an undefined value.

function getObject() {
  return
  {
    foo : 1
    // many more fields
  };
}
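
The cause is the same as for the ‘a + b’ example: a semicolon is inserted after ‘return’, and the braces that follow are then parsed as a block statement containing the label ‘foo’, not as an object literal.  A minimal sketch of the problem and the usual fix follows; the function names ‘getObjectBroken’ and ‘getObjectFixed’ are assumed for illustration.

```javascript
// Hypothetical names; the first function mirrors the example above.
function getObjectBroken() {
  return        // semicolon inserted here
  {             // parsed as a block statement, not an object literal
    foo : 1     // parsed as the label 'foo' followed by the expression 1
  };
}

function getObjectFixed() {
  return {      // opening brace on the same line as 'return'
    foo : 1
  };
}

console.log(getObjectBroken());    // undefined
console.log(getObjectFixed().foo); // 1
```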

Postfix Operators

The postfix operators ‘++’ and ‘--’ must appear on the same line as their operand.  If a LineTerminator occurs between the operand and the operator, then a semicolon will be inserted by the interpreter before the operator.  These mistakes are uncommon.  For example, a developer is unlikely to write ‘i++’ on multiple lines.
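
When such a split does occur, the operator attaches to the operand that follows it instead.  In the sketch below, the function name ‘splitIncrement’ and the variables are hypothetical; the three lines in the middle are parsed as ‘a;’ followed by ‘++b;’.

```javascript
// Hypothetical example: '++' on its own line binds to the next operand.
function splitIncrement() {
  var a = 1;
  var b = 5;
  a
  ++
  b
  return [a, b];
}

var result = splitIncrement();
console.log(result[0]); // 1 (a is unchanged)
console.log(result[1]); // 6 (b was incremented as a prefix operation)
```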

for Statements

The header of a ‘for’ loop must always contain two semicolons.  According to the specification, semicolons are never automatically inserted into the header of a ‘for’ loop.  This means that the programmer is responsible for including both semicolons.  For example, the following loops are valid JavaScript:

for (var i = 0; i < 5; i++) {
  // loop body
}

for (; ;) {
  // loop body
}

for (var i = 0; i < 5;
i++) {
  // loop body
}

However, the following loops are not valid because the missing second semicolon is not automatically inserted.

for (var i = 0; i < 5
i++) {
  // loop body
}

for ( ;
) {
  // loop body
}
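
One way to confirm these claims is to ask the engine to parse each loop without running it.  The sketch below assumes an environment where the ‘Function’ constructor is available (a content security policy can disable it, for example); the helper name ‘parses’ is hypothetical.  The constructor compiles its argument as a function body and throws a SyntaxError if the source is invalid.

```javascript
// Hypothetical helper: reports whether a piece of source code parses.
function parses(source) {
  try {
    new Function(source); // compiled only; the function is never called
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e; // anything else is unexpected
  }
}

console.log(parses("for (var i = 0; i < 5;\ni++) { }")); // true
console.log(parses("for (var i = 0; i < 5\ni++) { }"));  // false
console.log(parses("for ( ;\n) { }"));                   // false
```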

Empty Statements

Semicolons are also never inserted when the resulting statement would be the empty statement, that is, a statement that consists of only a semicolon.  The following ‘if-else’ statement is invalid.  The interpreter will not insert a semicolon in the ‘if’ clause because the resulting statement would be empty.

if (i === 5)
  // no semicolon will be inserted here
else
  foo = 0;
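
If an empty ‘if’ clause really is intended, it has to be written out, either as an explicit semicolon or as an empty block.  A minimal sketch, with a hypothetical function name and return value added so that the result can be observed:

```javascript
// Hypothetical wrapper around the example above, made valid by writing
// the empty statement explicitly.
function fooFor(i) {
  var foo = 1;
  if (i === 5)
    ; // explicit empty statement; it is never inserted automatically
  else
    foo = 0;
  return foo;
}

console.log(fooFor(5)); // 1
console.log(fooFor(3)); // 0
```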

A More Complicated Example

All semicolons have been removed from the following example.  In this case, it can be more difficult to determine the semantics of the code.  What will the value of ‘foo’ be at the end of the example?

var foo
var bar
var baz = function(data) {
  return data +
  1
}

bar = 1
foo = bar + baz
(bar + bar) + baz(bar)

Let’s analyze the code.  The first three lines declare the variables ‘foo’, ‘bar’, and ‘baz’.  ‘baz’ is a function that returns its ‘data’ argument plus one.  Because the expression ‘data + 1’ begins on the same line as the ‘return’ token, ‘baz’ returns the expected value.  The ‘bar’ assignment statement is straightforward.  The ‘foo’ assignment is trickier.  A semicolon is not inserted between the last two lines because the opening parenthesis is a valid continuation of the previous line: it is parsed as the start of a call to ‘baz’.  Therefore, the ‘foo’ assignment actually looks like this:

foo = bar + baz(bar + bar) + baz(bar);

Now it is fairly simple to compute ‘foo’.  The final value of ‘foo’ is six.
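
The result can be checked by running the example unchanged.  The sketch below only wraps it in a function, whose name ‘computeFoo’ is an assumption, so that ‘foo’ can be returned and inspected.

```javascript
// Hypothetical wrapper; the body is the semicolon-free example above.
function computeFoo() {
  var foo
  var bar
  var baz = function(data) {
    return data +
    1
  }

  bar = 1
  foo = bar + baz
  (bar + bar) + baz(bar)

  return foo
}

// 1 + baz(2) + baz(1) = 1 + 3 + 2
console.log(computeFoo()); // 6
```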

Things to Remember

  • If the programmer leaves out a semicolon, the JavaScript interpreter will insert it automatically in some circumstances.
  • Automatic semicolon insertion can introduce bugs which are difficult to locate because whitespace changes semantics.
  • Programmers should never rely on automatic semicolon insertion.

Appendix

The following excerpt is taken directly from Section 7.9.1 of the standard, which describes the rules for automatic semicolon insertion.  The ‘restricted productions’ mentioned in Rule #3 relate to postfix operators and the ‘continue’, ‘break’, ‘return’, and ‘throw’ statements.

Begin Excerpt

There are three basic rules of semicolon insertion:

  1. When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
    • The offending token is separated from the previous token by at least one LineTerminator.
    • The offending token is }.
  2. When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
  3. When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.

However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).

End Excerpt

 

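For this reason, many project managers on JavaScript projects require this format:

function test(){
    console.log('test');
}

rather than this one:

function test()
{
    console.log('test');
}

The aim is to minimize the occurrence of problems like these.

Time: 2024-12-10 10:06:22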
