Inconsistency begets insanity. If every developer follows the agreed coding conventions, life feels more wonderful. Since a string literal can be enclosed in either single or double quotes (ECMAScript 5 specification, section 7.8.4), it often helps to stick with one kind. For example, the jQuery code style mandates double quotes.

Personally I prefer single quotes. That’s just my preference, though. When looking at Esprima, I realized that I could use its non-destructive partial modification feature (see also the summary of its other features, from source location info to code generation) to force every string literal to use single quotes. And thus the following singlequote.js example was born.

var fs = require('fs'),
    esprima = require('esprima'),
    input = process.argv[2],
    output = process.argv[3],
    offset = 0,
    content = fs.readFileSync(input, 'utf-8'),
    tokens = esprima.parse(content, { tokens: true, range: true }).tokens;
 
// Re-quote a literal with single quotes, escaping any single quote inside it.
function convert(literal) {
    var result = literal.substring(1, literal.length - 1);
    result = result.replace(/'/g, '\\\'');
    return '\'' + result + '\'';
}
 
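tokens.forEach(function (token) {
    var str;
    if (token.type === 'String' && token.value[0] !== '\'') {
        str = convert(token.value);
        // range[1] is treated as an inclusive end position, hence the + 1.
        // Newer Esprima releases report an exclusive end; if yours does, drop the + 1.
        content = content.substring(0, offset + token.range[0]) + str +
            content.substring(offset + token.range[1] + 1, content.length);
        // Track how much the replacements so far have changed the length.
        offset += (str.length - token.value.length);
    }
});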
fs.writeFileSync(output, content);

Run it with Node.js like this:

node singlequote.js inputfile outputfile

How does this work? Let’s assume that the content of the input file is:

console.log("Hello")

When we ask the Esprima parser to consume it with the tokens option set to true, the parser also returns an array of all the tokens collected during parsing. For our example above, the array is:

[
    { type: "Identifier", value: "console", range: [0, 6] },
    { type: "Punctuator", value: ".", range: [7, 7] },
    { type: "Identifier", value: "log", range: [8, 10] },
    { type: "Punctuator", value: "(", range: [11, 11] },
    { type: "String", value: "\"Hello\"", range: [12, 18] },
    { type: "Punctuator", value: ")", range: [19, 19] }
]

Once the tokens are available, all we have to do is iterate over them and find the ones associated with string literals. Each token also carries location info in its range property, which denotes the zero-based start and end positions (inclusive). Of course, what interests us is only the String token:

    { type: "String", value: "\"Hello\"", range: [12, 18] }
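To make the inclusive range concrete, here is a minimal sketch that works on a hand-written copy of the token list above, so no parser is needed. The helper name textAt is made up for this illustration and is not part of Esprima.

```javascript
// Hand-written token list mirroring the example above (assumption: inclusive end positions).
var source = 'console.log("Hello")';
var tokens = [
    { type: 'Identifier', value: 'console', range: [0, 6] },
    { type: 'Punctuator', value: '.', range: [7, 7] },
    { type: 'Identifier', value: 'log', range: [8, 10] },
    { type: 'Punctuator', value: '(', range: [11, 11] },
    { type: 'String', value: '"Hello"', range: [12, 18] },
    { type: 'Punctuator', value: ')', range: [19, 19] }
];

// Hypothetical helper: slice the source using an inclusive [start, end] range.
function textAt(src, range) {
    return src.substring(range[0], range[1] + 1);
}

// Keep only the tokens that represent string literals.
var stringTokens = tokens.filter(function (token) {
    return token.type === 'String';
});

stringTokens.forEach(function (token) {
    // Prints the range together with the raw literal text, quotes included.
    console.log(token.range, textAt(source, token.range));
});
```

Because the end position is inclusive, the slice needs range[1] + 1; with a parser that reports an exclusive end, the + 1 would have to go.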

The range makes it possible to replace the literal in the original source with plain string operations; for the above example, the literal occupies positions [12, 18]. Care must be taken: if the literal contains one or more single quotes, they must be properly escaped (see SingleEscapeCharacter in section 7.8.4). Since escaping may change the length of the literal, the positions of all subsequent tokens need an offset adjustment as well. An example follows:

// before
"color = 'blue'";
 
// after
'color = \'blue\'';
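The offset adjustment matters as soon as there is more than one literal. The sketch below shows the idea on a hand-written token list (only the String tokens are listed, since the others are skipped anyway). The function names toSingleQuoted and rewrite are invented for this illustration, and the ranges are assumed to use inclusive end positions.

```javascript
// Hypothetical helper: re-quote with single quotes, escaping single quotes inside.
// Simplification: assumes the literal contains no backslash escapes.
function toSingleQuoted(literal) {
    var body = literal.substring(1, literal.length - 1);
    return '\'' + body.replace(/'/g, '\\\'') + '\'';
}

// Rewrite every double-quoted String token; range is [start, end], end inclusive.
function rewrite(source, tokens) {
    var offset = 0;
    tokens.forEach(function (token) {
        var str;
        if (token.type === 'String' && token.value[0] === '"') {
            str = toSingleQuoted(token.value);
            source = source.substring(0, offset + token.range[0]) + str +
                source.substring(offset + token.range[1] + 1);
            // Later ranges still refer to the original text, so remember the drift.
            offset += str.length - token.value.length;
        }
    });
    return source;
}

// Source text: f("it's", "ok")
var input = 'f("it\'s", "ok")';
var stringTokens = [
    { type: 'String', value: '"it\'s"', range: [2, 7] },
    { type: 'String', value: '"ok"', range: [10, 13] }
];

console.log(rewrite(input, stringTokens)); // prints: f('it\'s', 'ok')
```

The first literal grows by one character once its single quote is escaped, so without the offset the second replacement would land one position too early.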

The conversion still does not have the ability to do the reverse, i.e. removing unnecessary escaped characters. This is the case where double-quotes in the literal need not be escaped anymore. This functionality is left as an exercise to the readers!
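For the curious, here is one possible take on that exercise. It is only a sketch, not part of singlequote.js: the name convertEscapeAware is made up, and it assumes the input is the raw text of a valid string literal. It walks over escape sequences as whole units, so a single quote that is already escaped is left alone, while an escaped double quote loses its backslash.

```javascript
// Hypothetical escape-aware variant of convert(); assumes a valid raw literal.
function convertEscapeAware(literal) {
    var body = literal.substring(1, literal.length - 1);

    if (literal[0] !== '"') {
        return literal; // already single-quoted, leave untouched
    }

    // Match either a backslash plus the character after it, or a bare single quote.
    body = body.replace(/\\[\s\S]|'/g, function (match) {
        if (match === '\'') {
            return '\\\''; // bare single quote: must be escaped now
        }
        if (match === '\\"') {
            return '"';    // escaped double quote: backslash no longer needed
        }
        return match;      // any other escape sequence is kept as is
    });

    return '\'' + body + '\'';
}

// Source text of the argument: "say \"hi\""
console.log(convertEscapeAware('"say \\"hi\\""')); // prints: 'say "hi"'
```

Treating each backslash pair as one match is what keeps an escaped backslash followed by a quote from being misread.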

Obviously this tool is nothing more than an academic exercise. Most editors support search-and-replace, though you need to be careful not to change unrelated quotes unintentionally. I’m sure there is an IDE out there which can carry out the same task efficiently. I do hope that whatever technique you use takes into account the escaping issue mentioned above.

Got some other ideas with the token list and partial modification?

  • http://ariya.ofilabs.com/ Ariya Hidayat

    In case someone wants to use the above example code, herewith I declare that it’s Public Domain.

  • http://paulirish.com Paul Irish

    While jQuery coding guidelines state double quotes, I’m fairly sure the dev team agrees it’s a legacy decision and they all prefer single quotes in JS. Whoops.

    • http://ariya.ofilabs.com/ Ariya Hidayat

      Time to convert those string literals :)

  • http://twitter.com/eriwen Eric Wendelin

    Agreed on the single quotes front. Since I write in so many languages with String interpolation, I keep all non-interpolated strings within single quotes, so there’s less to think about.

  • Krinkle Krinkle

    Little typo in the if statement: ` token.value[0] !== ''' `.

    Needs to be escaped, or use double quotes 😉

    • http://twitter.com/ariyahidayat Ariya Hidayat

      Good catch, thanks!

  • 紫云飞

    If a string literal’s value is "\'", then the escaped value will be '\\'', and this will cause a syntax error.

  • http://www.jennieandscott.com/scott Scott Rippey

    I agree that consistency is key. So I actually use BOTH, consistently.
    In JS, strings can serve distinct purposes: messages or keywords. So I
    use single-quotes for all code-related keywords, and double-quotes for
    all messages.

    Contrived example: console['log']("Hello");

    I feel that this convention actually adds semantic context to my strings,
    and when I see “double quotes” I know the text is arbitrary, and when I
    see ‘singles’ I know the text is a keyword.