Locale Aware Sorting in JavaScript

Published

March 28, 2022

This post is part of my Byte Series, where I document tips, tricks, and tools that I've found useful.

Problem

When building a localized JavaScript web app, the default sorting logic for strings doesn't yield the results you might expect, because it compares strings by their UTF-16 code unit values rather than by the rules of a language. Take the following example:

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort();
console.log(strings);
// ['NOP', 'abc', 'abc', 'nop', 'äbc', 'ñop'] 

If it weren't for the accented characters, you could lowercase everything to shift NOP to the intended place. That technique does not sort properly with localization in mind, though, because accented characters such as ä and ñ still land after every unaccented letter.


Solutions

Thankfully, there are a couple of options that you can use to apply locale-aware sorting: localeCompare and Intl.Collator. We will take a look at both of these approaches used inside the Array's sort method, but first let's briefly explain what a compareFunction is.

What is a compareFunction?

You can customize how an Array sorts by providing a compareFunction as an argument. This function takes two parameters (typically named a and b) and returns a positive number, a negative number, or 0.

  • If the result is negative, then a should be before b.
  • If the result is positive, then b should be before a.
  • If the result is zero, then a and b are considered equal and keep their original relative order.
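
To make the contract above concrete, here is a minimal sketch that uses numbers rather than strings. The values are made up for illustration. Without a compareFunction, sort converts the elements to strings, so 10 lands before 9; subtracting b from a produces the negative, positive, or zero result described above.

```javascript
// Illustrative values only; they are not part of the examples in this post.
const numbers = [10, 9, 1, 100];

// Default sort compares elements as strings, so "10" and "100" come before "9".
const defaultOrder = [...numbers].sort();

// a - b is negative when a < b, positive when a > b, and 0 when they are equal.
const numericOrder = [...numbers].sort((a, b) => a - b);

console.log(defaultOrder); // [1, 10, 100, 9]
console.log(numericOrder); // [1, 9, 10, 100]
```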

The following is an example of what a compareFunction can look like. This function lowercases both strings before comparing them. It does not solve the sorting problem listed above: the result is a little better than our original attempt, but it still doesn't account for locale-specific characters.

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort((a, b) => {
  const lowerA = a.toLowerCase();
  const lowerB = b.toLowerCase();
  if (lowerA < lowerB) {
    return -1; // A is less than B
  } else if (lowerA > lowerB) {
    return 1;  // A is greater than B 
  } else {
    return 0;  // A and B are equal
  }
});
console.log(strings);
// ['abc', 'abc', 'nop', 'NOP', 'äbc', 'ñop']

Using localeCompare in the compareFunction

As mentioned above, modern browsers have better techniques to compare strings with locale in mind. First we will look at the localeCompare method on the String prototype. It is called on one string and takes the string to compare against as its argument, and it returns a negative value, a positive value, or 0 depending on how the two strings compare. That matches the contract of the compareFunction described above, so it can be called directly inside of one.

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort((a, b) => a.localeCompare(b));
console.log(strings);
// ['abc', 'abc', 'äbc', 'nop', 'NOP', 'ñop']
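
localeCompare also accepts an optional locale as its second argument, which helps when the order should not depend on the default locale of the runtime. The sketch below assumes a runtime with full ICU locale data (current browsers and default Node.js builds); the letters are chosen only to show how two languages can disagree.

```javascript
const letters = ["z", "a", "ä"];

// In German, ä sorts alongside a.
const german = [...letters].sort((a, b) => a.localeCompare(b, "de"));

// In Swedish, ä is a separate letter that comes after z.
const swedish = [...letters].sort((a, b) => a.localeCompare(b, "sv"));

console.log(german);  // ['a', 'ä', 'z']
console.log(swedish); // ['a', 'z', 'ä']
```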

Sorting by an Object Property

When I need to sort in a web-app, I'm typically trying to sort objects in an array. Thankfully, you can tweak the compareFunction to access the property that needs sorting.

let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
objects.sort((a, b) => a.name.localeCompare(b.name));
console.log(objects);
/*
[
  { "name": "abc", "value": 3 },
  { "name": "abc", "value": 2 },
  { "name": "äbc", "value": 1 },
  { "name": "nop", "value": 3 },
  { "name": "NOP", "value": 2 },
  { "name": "ñop", "value": 1 }
]
*/

Using Intl.Collator in the compareFunction

Another way to sort with language-sensitive string comparison is to use Intl.Collator. With this approach, you call the Intl.Collator constructor to create a collator object, and then use its compare method in the compareFunction that you pass to the Array's sort method.

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
const collator = new Intl.Collator('en');
strings.sort((a, b) => collator.compare(a, b)); 
console.log(strings);

Since the collator.compare method accepts the same parameters as the compareFunction, you can simplify the line above by passing compare directly to the sort method.

strings.sort(collator.compare);

You might be wondering why you should use this approach instead of the localeCompare method from the previous section. MDN recommends that you use Intl.Collator for performance reasons when "comparing large numbers of strings".
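
As a rough sketch of that advice, a single collator can be created once and reused for every sort. The compare property is a getter that returns a function bound to the collator, which is why it can be handed to sort without a wrapper. The variable names here are my own.

```javascript
// Create the collator once and reuse it for every sort.
const sharedCollator = new Intl.Collator("en");

const first = ["nop", "NOP", "ñop", "abc", "abc", "äbc"];
const second = ["b", "ä", "a"];

// compare is already bound to sharedCollator, so no arrow function is needed.
first.sort(sharedCollator.compare);
second.sort(sharedCollator.compare);

console.log(first);  // ['abc', 'abc', 'äbc', 'nop', 'NOP', 'ñop']
console.log(second); // ['a', 'ä', 'b']
```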

Sorting by an Object Property

You can also sort arrays of objects like we did in the previous example. In this case we leverage the collator.compare method and pass along the properties that we want to sort by.

let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
const collator = new Intl.Collator('en');
objects.sort((a, b) => collator.compare(a.name, b.name));
console.log(objects);
/*
[
    { "name": "abc", "value": 3 },
    { "name": "abc", "value": 2 },
    { "name": "äbc", "value": 1 },
    { "name": "nop", "value": 3 },
    { "name": "NOP", "value": 2 },
    { "name": "ñop", "value": 1 }
]
*/
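
If you sort by different properties in several places, the pattern above can be wrapped in a small factory. This is only a sketch: compareBy is a hypothetical helper name rather than a built-in, and the default locale of "en" is an assumption.

```javascript
// compareBy is a hypothetical helper, not part of JavaScript or Intl.
// It builds the collator once and returns a compareFunction for one property.
function compareBy(property, locale = "en") {
  const collator = new Intl.Collator(locale);
  return (a, b) => collator.compare(a[property], b[property]);
}

const entries = [
  { name: "nop", value: 3 },
  { name: "äbc", value: 1 },
  { name: "abc", value: 2 },
];

entries.sort(compareBy("name"));
console.log(entries.map((entry) => entry.name)); // ['abc', 'äbc', 'nop']
```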

Sorting Objects with a Primary and Secondary Property

When you sort an array and have several exact matches, it is handy to have a secondary property to sort by to break the tie. You can use the same approach as above, but with a little more logic. Inside the compareFunction, if the two properties have the same value (a zero compare value), then you can compare again by a secondary property.

let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
const collator = new Intl.Collator('en');
objects.sort((a, b) => {
    // Compare the strings via locale
    let diff = collator.compare(a.name, b.name);
    if (diff === 0) {
        // If the strings are equal compare the numbers
        return a.value - b.value;
    }
    return diff;
});
console.log(objects);
/*
[
    { "name": "abc", "value": 2 }, // name is same, sort by value
    { "name": "abc", "value": 3 }, // name is same, sort by value
    { "name": "äbc", "value": 1 },
    { "name": "nop", "value": 3 },
    { "name": "NOP", "value": 2 },
    { "name": "ñop", "value": 1 }
]
*/
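
The same tie-break can be written more compactly with the || operator. This is a stylistic variation rather than a different technique: compare returns 0 for a tie, 0 is falsy, and so the expression falls through to the numeric comparison. The sample data is trimmed down for illustration.

```javascript
const collator = new Intl.Collator("en");
const items = [
  { name: "abc", value: 3 },
  { name: "nop", value: 1 },
  { name: "abc", value: 2 },
];

// A non-zero name comparison wins; a tie (0) falls through to the value difference.
items.sort((a, b) => collator.compare(a.name, b.name) || a.value - b.value);

console.log(items);
/*
[
  { name: 'abc', value: 2 },
  { name: 'abc', value: 3 },
  { name: 'nop', value: 1 }
]
*/
```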

Additional Locale Specific Sorting Options

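Both of the sorting techniques above accept additional options that help refine the sorting logic. Intl.Collator takes them as the second argument of its constructor, and localeCompare takes them as its third argument.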

const collator = new Intl.Collator('en', {
  sensitivity: 'base',     // base, accent, case, variant 
  caseFirst: 'upper',      // upper, lower, false
  usage: 'sort',           // sort, search
  ignorePunctuation: true, // true, false
  numeric: true,           // true, false
});
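
To give a feel for what a few of these options do, here is a short sketch that sets them one at a time. The sample strings are made up, and the results assume a runtime with full ICU data.

```javascript
// numeric: digit runs are compared as numbers, so "item2" comes before "item10".
const numericCollator = new Intl.Collator("en", { numeric: true });
const files = ["item10", "item2", "item1"].sort(numericCollator.compare);
console.log(files); // ['item1', 'item2', 'item10']

// sensitivity 'base': differences in case and accents are ignored.
const baseCollator = new Intl.Collator("en", { sensitivity: "base" });
console.log(baseCollator.compare("a", "A")); // 0
console.log(baseCollator.compare("a", "ä")); // 0

// caseFirst 'upper': uppercase sorts before lowercase when strings otherwise match.
const upperFirst = new Intl.Collator("en", { caseFirst: "upper" });
const cased = ["nop", "NOP"].sort(upperFirst.compare);
console.log(cased); // ['NOP', 'nop']

// localeCompare accepts the same options as its third argument.
const sameBase = "a".localeCompare("A", "en", { sensitivity: "base" });
console.log(sameBase); // 0
```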

You can find more details about these options in the TC39 documentation.
