State of char type
ReScript has the char primitive type, which is rarely used. (I was one of those who used char to handle ASCII keycodes)
https://rescript-lang.org/docs/manual/latest/primitive-types#char
> Note: Char doesn't support Unicode or UTF-8 and is therefore not recommended.
The char type doesn't support Unicode; it only supports a single UTF-16 codepoint. A char literal such as '👋' compiles to its numeric codepoint value.
Its value is the same as the result of '👋'.codePointAt(0) in JavaScript, which means that in its value representation, char is equivalent to int (a 16-bit subset).
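To make this concrete, here is what the equivalent JavaScript calls return (a quick illustration using the same emoji as above):

```javascript
// '👋' (U+1F44B) lies outside the BMP, so in UTF-16 it takes
// two code units (a surrogate pair), but it is one codepoint.
console.log("👋".codePointAt(0)); // 128075 (0x1F44B)
console.log("👋".length);         // 2 (UTF-16 code units)
```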
So why don't we just use int instead of char?

char literals are automatically compiled to codepoints. This is much more efficient than a string representation when dealing with Unicode data tables.

char supports range patterns (e.g. 'a' .. 'z') in pattern matching. This is very useful when writing parsers.
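For example, a parser-style character classifier can be written with range patterns (a sketch; `isIdentStart` is an illustrative name, not an existing API):

```rescript
// Classify an identifier-start character using char range patterns.
let isIdentStart = c =>
  switch c {
  | 'a' .. 'z' | 'A' .. 'Z' | '_' => true
  | _ => false
  }
```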
However, a char literal is not really useful for representing a Unicode character, because it doesn't cover every Unicode sequence: it only keeps the first codepoint and discards the rest of the character segment.
To avoid problems, we should limit the value range of char literals to the BMP (Basic Multilingual Plane, U+0000–U+FFFF).
Suggestion
I suggest some changes that would keep the useful parts of char while removing the confusion:
- Get rid of the char type, or make it an alias of int
- Keep char literal syntax, but with internal representation as regular integers
- Limit the char literal range to the BMP at the syntax level
- Support range patterns for regular integers
- Remove the Char module
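If range patterns were extended to regular integers as suggested, the same parser-style matching would work on plain codepoint values (hypothetical syntax; this does not compile today):

```rescript
// Hypothetical: range patterns over regular ints.
// 97 and 122 are the codepoints of 'a' and 'z'.
let isLowerAscii = code =>
  switch code {
  | 97 .. 122 => true
  | _ => false
  }
```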