Locale Aware Sorting in JavaScript



Problem

When building a localized JavaScript web app, the default sorting logic for
strings doesn’t yield the results that you might expect. Take the following
example…

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort();
console.log(strings);

If it weren’t for the accented characters, you could try lowercasing everything
to shift NOP to the intended place, but to properly sort with localization
in mind, this technique does not work.
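The reason for the surprising order is that, without a compareFunction, sort compares strings by their UTF-16 code unit values. Every uppercase ASCII letter therefore comes before every lowercase one, and accented letters land after z. Here is a quick sketch that makes this visible. The variable names are only for illustration, and it assumes the accented letters are stored as single precomposed characters.

```javascript
const strings = ["nop", "NOP", "ñop", "abc", "abc", "äbc"];

// Show the code unit of the first character of each string.
const firstUnits = strings.map((s) => `${s[0]} = ${s.charCodeAt(0)}`);
console.log(firstUnits);
// [ 'n = 110', 'N = 78', 'ñ = 241', 'a = 97', 'a = 97', 'ä = 228' ]

// The default sort follows those numbers, not the alphabet.
const defaultSorted = [...strings].sort();
console.log(defaultSorted);
// [ 'NOP', 'abc', 'abc', 'nop', 'äbc', 'ñop' ]
```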


Solutions

Thankfully there are a couple of options that you can use to apply locale-aware
sorting: localeCompare and Intl.Collator. We will take a look at both of these
approaches used inside the Array’s sort method, but first let’s briefly explain
what a compareFunction is.

What is a compareFunction?

You can customize how an Array sorts by providing a compareFunction as an
argument. This function takes two parameters (typically named a and b) and
returns a value that is positive, negative, or 0.

  • If the result is negative, then a should be before b,
  • If the result is positive, then b should be before a,
  • If the result is zero, then a and b are equal.
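As a small illustration of that contract, independent of strings, here is a sketch that sorts numbers. The array and the function name compareNumbers are made up for this example.

```javascript
const numbers = [10, 1, 5, 100];

// Negative when a < b, positive when a > b, zero when they are equal.
function compareNumbers(a, b) {
  return a - b;
}

const ascending = [...numbers].sort(compareNumbers);
console.log(ascending); // [ 1, 5, 10, 100 ]

// Swapping the operands reverses the order.
const descending = [...numbers].sort((a, b) => compareNumbers(b, a));
console.log(descending); // [ 100, 10, 5, 1 ]
```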

The following is an example of what a compareFunction can look like. This
function forces each string to be compared after they have been lowercased. It
does not solve our sorting problem listed above. The sorting is a little better
than our original attempt, but it still doesn’t account for the special
locale-specific characters.

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort((a, b) => {
  const lowerA = a.toLowerCase();
  const lowerB = b.toLowerCase();
  if (lowerA < lowerB) {
    return -1; // a sorts before b
  } else if (lowerA > lowerB) {
    return 1;  // b sorts before a
  } else {
    return 0;  // a and b are considered equal
  }
});
console.log(strings);

Using localeCompare in the compareFunction

As mentioned above, modern browsers have better techniques to compare strings
with locale in mind. First we will look at the localeCompare method on the
String prototype. It fits the contract of the compareFunction described above:
you call it on one string, pass the other string as the argument, and it
returns a negative value, a positive value, or 0 depending on how the two
strings compare to each other.

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
strings.sort((a, b) => a.localeCompare(b));
console.log(strings);
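localeCompare also accepts optional locales and options arguments. When they are omitted, the runtime’s default locale decides the order. The sketch below pins the locale to "en" so that the result does not depend on the machine it runs on. The choice of "en" and the variable names are assumptions made for this illustration.

```javascript
const strings = ["nop", "NOP", "ñop", "abc", "abc", "äbc"];

// Pin the locale so the order does not depend on the runtime's default.
const sortedEn = [...strings].sort((a, b) => a.localeCompare(b, "en"));
console.log(sortedEn);
// [ 'abc', 'abc', 'äbc', 'nop', 'NOP', 'ñop' ]

// Only the sign of the result is specified, so test it with < 0, > 0, or === 0.
const beforeAccent = "abc".localeCompare("äbc", "en") < 0;
const sameString = "abc".localeCompare("abc", "en") === 0;
console.log(beforeAccent, sameString); // true true
```

Checking the sign rather than comparing against exactly -1 or 1 keeps the code portable, since engines are free to return any negative or positive number.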

Sorting by an Object Property

When I need to sort in a web-app, I’m typically trying to sort objects in an
array. Thankfully, you can tweak the compareFunction to access the property
that needs sorting.

let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
objects.sort((a, b) => a.name.localeCompare(b.name));
console.log(objects);

Using Intl.Collator in the compareFunction

Another way to sort with language-sensitive string comparison is to use
Intl.Collator. With this approach, you call the Intl.Collator constructor to
create a collator object and then use it in your compareFunction. The collator
has a compare method that can be leveraged inside of the Array’s sort method.

let strings = [ "nop", "NOP", "ñop", "abc", "abc", "äbc" ];
const collator = new Intl.Collator('en');
strings.sort((a, b) => collator.compare(a, b)); 
console.log(strings);

Since the collator.compare method accepts the same parameters as the
compareFunction, you can simplify the line above by passing collator.compare
directly to the sort method.

strings.sort(collator.compare);

You might be wondering why you should use this approach versus the
localeCompare method in the previous section. MDN recommends that you use
Intl.Collator for performance reasons when “comparing large numbers of
strings”.
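In practice this means building the collator once and reusing it for every sort. The compare property is a getter that returns a function already bound to its collator, which is also why it can be passed straight to sort. A minimal sketch follows. The sample arrays are invented for the example.

```javascript
// Create the collator once and reuse it for every sort.
const collator = new Intl.Collator("en");

// compare is already bound to the collator, so it can be detached safely.
const compare = collator.compare;

const names = ["ñop", "NOP", "nop"];
const cities = ["Zürich", "Zagreb", "Aarhus"];

const sortedNames = [...names].sort(compare);
const sortedCities = [...cities].sort(compare);

console.log(sortedNames);  // [ 'nop', 'NOP', 'ñop' ]
console.log(sortedCities); // [ 'Aarhus', 'Zagreb', 'Zürich' ]
```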

Sorting by an Object Property

You can also sort arrays of objects like we did in the previous example. In this
case we leverage the collator.compare method and pass along the properties
that we want to sort by.

let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
const collator = new Intl.Collator('en');
objects.sort((a, b) => collator.compare(a.name, b.name));
console.log(objects);

Sorting Objects with a Primary and Secondary Property

When you sort an array and have several exact matches, it is handy to
have a secondary property to sort by to break the tie. You can use the same
approach as above, but with a little more logic. Inside the compareFunction,
if the two properties have the same value (a zero compare value), then you can
compare again by a secondary property.

let objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
const collator = new Intl.Collator('en');
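objects.sort((a, b) => {
    // Compare by the primary property first.
    let diff = collator.compare(a.name, b.name);
    if (diff === 0) {
        // The names tie, so fall back to the secondary property.
        return a.value - b.value;
    }
    return diff;
});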
console.log(objects);
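The same tie-break can be written more compactly with the || operator, because a zero result is falsy and lets the second comparison take over. This is an equivalent formulation of the logic above rather than a different technique. The summary variable is only there to make the output easy to read.

```javascript
const objects = [
  { name: "nop", value: 3 },
  { name: "NOP", value: 2 },
  { name: "ñop", value: 1 },
  { name: "abc", value: 3 },
  { name: "abc", value: 2 },
  { name: "äbc", value: 1 },
];
const collator = new Intl.Collator("en");

// If the names compare as equal (0), the numeric difference decides.
objects.sort((a, b) => collator.compare(a.name, b.name) || a.value - b.value);

const summary = objects.map((o) => `${o.name}:${o.value}`);
console.log(summary);
// [ 'abc:2', 'abc:3', 'äbc:1', 'nop:3', 'NOP:2', 'ñop:1' ]
```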

Additional Locale Specific Sorting Options

Both of the above sorting techniques have additional options that you can pass
along to help refine the sorting logic.

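const collator = new Intl.Collator('en', {
  sensitivity: 'base',     // only base letters differ: a = á = A
  caseFirst: 'upper',      // uppercase sorts before lowercase when case is compared
  usage: 'sort',           // 'sort' (the default) or 'search'
  ignorePunctuation: true, // skip punctuation when comparing
  numeric: true,           // compare digit runs as numbers: "2" < "10"
});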

You can find more details about these options in the TC39 documentation.
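To make two of these options concrete, here is a short sketch of numeric and sensitivity in action, including the equivalent localeCompare call. The sample strings and variable names are made up for the illustration.

```javascript
const items = ["item10", "item2", "item1"];

// Without numeric, digits are compared character by character.
const plain = [...items].sort(new Intl.Collator("en").compare);
console.log(plain); // [ 'item1', 'item10', 'item2' ]

// With numeric: true, runs of digits are compared as numbers.
const natural = [...items].sort(new Intl.Collator("en", { numeric: true }).compare);
console.log(natural); // [ 'item1', 'item2', 'item10' ]

// sensitivity: 'base' treats case and accent differences as equal.
const base = new Intl.Collator("en", { sensitivity: "base" });
const caseResult = base.compare("a", "A");
const accentResult = base.compare("a", "ä");
console.log(caseResult, accentResult); // 0 0

// localeCompare takes the same options as its third argument.
const viaLocaleCompare = "a".localeCompare("A", "en", { sensitivity: "base" });
console.log(viaLocaleCompare); // 0
```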