Releases: cometkim/unicode-segmenter
[email protected]
Patch Changes
- a5f486f: Fix bloat in the NPM package.
  package.tgz was mostly bloated by CommonJS interop and sourcemaps. However, sourcemaps aren't necessary here because the package uses its sources as-is, and the CommonJS output shouldn't be different. This is now fixed by using simpler transpilation for the CommonJS entries and removing the sourcemap files.
  Inaccessible entries were also removed. As a result, the total unpacked package size is down from 250 KB to 135 KB.
  Note: Node.js v22 will stabilize require(ESM), which will allow CommonJS projects to use this package without having to maintain separate entries. I'm very excited about that and look forward to it becoming more "common". The first major release may consider ending support for CommonJS entries and TypeScript's "Node" resolution.
[email protected]
[email protected]
[email protected]
Minor Changes
- ffb41fb: Code size is significantly reduced; the minified JS is now about half its previous size.
  There are also some performance improvements. They are modest, but shrinking the size without giving up performance is a huge win.
  - Compressed the Unicode data further using Base36.
  - Changed the internal representation to TypedArray to improve its access pattern.
  - Shrank the grapheme lookup table size. This does not impact performance except for some edge cases like Hindi and Demonic, but it does reduce the bundle size.
- 9e0feca: Update to Unicode® 16.0.0
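The Base36 and TypedArray changes in ffb41fb can be pictured with a minimal sketch. This is not the library's actual data format: the helper names (`encodeRanges`, `decodeRanges`, `findRange`) and the delta-encoded layout are assumptions made purely for illustration. The idea is that ranges are stored as a compact Base36 string and unpacked once into a flat `Uint32Array` that a binary search can scan.

```javascript
// Hypothetical sketch: pack sorted, non-overlapping code point ranges as a
// Base36 string, then unpack them into a Uint32Array for lookup.
// This is NOT the library's actual data format.

/** Encode [start, end] pairs as comma-separated Base36 deltas. */
function encodeRanges(ranges) {
  let prev = 0;
  const parts = [];
  for (const [start, end] of ranges) {
    // Store the gap from the previous range and the length of this range.
    parts.push((start - prev).toString(36), (end - start).toString(36));
    prev = end;
  }
  return parts.join(',');
}

/** Decode into a flat Uint32Array: [start0, end0, start1, end1, ...]. */
function decodeRanges(encoded) {
  if (encoded === '') return new Uint32Array(0);
  const parts = encoded.split(',');
  const table = new Uint32Array(parts.length);
  let prev = 0;
  for (let i = 0; i < parts.length; i += 2) {
    const start = prev + parseInt(parts[i], 36);
    const end = start + parseInt(parts[i + 1], 36);
    table[i] = start;
    table[i + 1] = end;
    prev = end;
  }
  return table;
}

/** Binary search; returns the index of the range containing cp, or -1. */
function findRange(table, cp) {
  let lo = 0;
  let hi = table.length / 2 - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    if (cp < table[mid * 2]) hi = mid - 1;
    else if (cp > table[mid * 2 + 1]) lo = mid + 1;
    else return mid;
  }
  return -1;
}

// Devanagari (U+0900..U+097F) and a slice of emoji (U+1F600..U+1F64F).
const encoded = encodeRanges([[0x0900, 0x097f], [0x1f600, 0x1f64f]]);
const table = decodeRanges(encoded);
```

Delta encoding keeps the numbers small, so their Base36 form stays short, while the typed array gives contiguous, fixed-width storage for the lookup.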
[email protected]
Patch Changes
- 3665cf7: Fix Hindi text segmentation
[email protected]
[email protected]
Patch Changes
- 447b484: Fix the polyfill so that it does not override an existing implementation and is assigned as a non-enumerable property
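The behavior described in 447b484 can be sketched as follows. This is an illustration of the general technique, not the library's code: `installPolyfill`, `FakeSegmenter`, `NativeSegmenter`, and the plain target objects are hypothetical names standing in for the real polyfill and the global it patches.

```javascript
// Hypothetical sketch of a non-overriding, non-enumerable polyfill install.
// All names here are illustrative, not the library's actual code.

function installPolyfill(target, name, implementation) {
  // Leave any existing implementation untouched.
  if (name in target && target[name] != null) return false;
  Object.defineProperty(target, name, {
    value: implementation,
    enumerable: false, // hidden from for...in and Object.keys
    writable: true,
    configurable: true,
  });
  return true;
}

class FakeSegmenter {}
class NativeSegmenter {}

// One target lacks the feature, the other already provides it.
const withoutNative = {};
const withNative = { Segmenter: NativeSegmenter };

const installedOnEmpty = installPolyfill(withoutNative, 'Segmenter', FakeSegmenter);
const installedOnNative = installPolyfill(withNative, 'Segmenter', FakeSegmenter);
```

Using `Object.defineProperty` instead of plain assignment is what keeps the property non-enumerable, which matches how built-in properties are normally defined.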
[email protected]
Patch Changes
- 04fe2fc: Fix sourcemap reference error
  - Include missing sourcemap files for transformed cjs entries
  - Remove unnecessary transforms for esm entries and remove source map reference
[email protected]
Minor Changes
- 657e31a: semi-breaking: removed _cat from grapheme cluster segments because it was useless.
  Instead, added _catBegin and _catEnd as the beginning and end categories of segments, which may be useful for inferring the applied boundary rules.
[email protected]
Minor Changes
- f5ec709: Deprecated isEmoji(cp) in favor of isExtendedPictographic(cp).
  There are no differences between the two, but the old name was easily confused with the \p{Emoji} Unicode property.
  (Note: \p{Emoji} is not useful in actual use cases, see)
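The confusion with \p{Emoji} can be seen without the library, using only the built-in Unicode property escapes of JavaScript regular expressions. In this small sketch (the variable names are illustrative), digits and `#` carry the Emoji property but are not Extended_Pictographic:

```javascript
// Uses only built-in Unicode property escapes, not the library itself.
const emojiProperty = /^\p{Emoji}$/u;
const extendedPictographic = /^\p{Extended_Pictographic}$/u;

// A digit, a keycap base, an emoji, and a plain letter.
const samples = ['1', '#', '😀', 'a'];

const results = samples.map((ch) => ({
  ch,
  emoji: emojiProperty.test(ch),
  extendedPictographic: extendedPictographic.test(ch),
}));

// '1' and '#' match \p{Emoji} only; '😀' matches both; 'a' matches neither.
console.log(results);
```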
Patch Changes
- 5bf4d29: Fix the TypeScript definition for GraphemeCategory enum