diff --git a/CHANGELOG.md b/CHANGELOG.md
index fbc7b6f..bd6c512 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -83,8 +83,14 @@
 - `ClassDeclaration`: support `extends` keyword (ngoài dạng paren)
 - Docs: xóa JSON-based dummy output từ Vite/Webpack plugin
 
+### Tokenizer
+- State-machine tokenizer: char-code dispatch (`packages/parser/src/tokenizer.ts`), keyword trie with 3 boundary kinds (`WORD` for ASCII `\b`, `IDENT` for Vietnamese keywords ending in a non-ASCII character, `NONE` for keywords that need no boundary check), operator longest-match trie, bounded backtracking for multi-word identifiers vs multi-word keywords
+- Bench infrastructure: 5 fixtures (tiny / medium / keywordHeavy / stringHeavy / large), `pnpm bench` + `pnpm bench:baseline` (dumps JSON)
+- Snapshot drift tests + smoke parity tests for the fixtures
+- Doc: `docs/architecture/tokenizer.md`
+
 ### Stats
-- Tests: 70 → 298 (100% pass)
+- Tests: 70 → 357 (100% pass)
 - Coverage: 79.85% → 92.44% statements / 71.23% → 85.88% branches / 84.89% → 96.25% functions
 - Compatibility matrix: 40.8% → 98.5% complete (135/137 features)
 
diff --git a/README.md b/README.md
index a7208bf..42dfc02 100644
--- a/README.md
+++ b/README.md
@@ -73,9 +73,12 @@ VietScript giữ ngữ nghĩa JavaScript 100% — chỉ thay keyword tiếng Anh
 | ✅ | Escape sequence đầy đủ (`\n`, `\x41`, `\u{1F600}`, v.v.) |
 | ✅ | Error messages tiếng Việt có file:line:col + snippet |
 | ✅ | Source maps (debug stack trace trỏ về file `.vjs`) |
+| ✅ | Tokenizer state-machine (char-code dispatch + keyword trie + bounded backtracking) |
 
 **Kiểm tra chi tiết:** [docs/compatibility.md](docs/compatibility.md) — 70.9% ✅ complete, 24.6% 🟡 partial, 4.5% ❌ missing.
 
+**Tokenizer architecture:** [docs/architecture/tokenizer.md](docs/architecture/tokenizer.md).
+
 **Lộ trình:** [docs/roadmap.md](docs/roadmap.md).
 
 ## Dự án cấu trúc
@@ -94,8 +97,9 @@ packages/
 
 ```bash
 pnpm install
-pnpm test                 # 249 test
+pnpm test                 # 402 test
 pnpm test:coverage        # coverage report (≥88% statements)
+pnpm bench                # benchmark tokenizer (5 fixtures)
 pnpm lint
 pnpm build                # build tất cả package
 ```
diff --git a/docs/.vitepress/config.ts b/docs/.vitepress/config.ts
index a958d69..258d25d 100644
--- a/docs/.vitepress/config.ts
+++ b/docs/.vitepress/config.ts
@@ -31,6 +31,12 @@ export default defineConfig({
           { text: 'Câu lệnh duyệt', link: '/basics/switch-case' },
         ],
       },
+      {
+        text: 'Architecture',
+        items: [
+          { text: 'Tokenizer (state machine)', link: '/architecture/tokenizer' },
+        ],
+      },
     ],
 
     socialLinks: [{ icon: 'github', link: 'https://github.com/vuejs/vitepress' }],
diff --git a/docs/architecture/tokenizer.md b/docs/architecture/tokenizer.md
new file mode 100644
index 0000000..e9bbbaa
--- /dev/null
+++ b/docs/architecture/tokenizer.md
@@ -0,0 +1,125 @@
+# Tokenizer Architecture
+
+The VietScript tokenizer is a hand-written **state machine** that runs over the `.vjs` source string, combined with a **character trie** for keywords and operators and **bounded backtracking** for multi-word identifiers. This page describes the design, the main components, and how to add a new keyword.
+
+- File: [`packages/parser/src/tokenizer.ts`](https://github.com/imrim12/vietscript/blob/main/packages/parser/src/tokenizer.ts)
+- API: `getNextToken()`, `isEOF()`, `rollback(step)`, `getCursor()`.
+
+## Main loop
+
+`getNextToken()` reads `source.charCodeAt(cursor)` and branches directly on the char code:
+
+```
+DEFAULT
+ ├── whitespace      → skip
+ ├── '/' '/'          → line comment, skip to '\n'
+ ├── '/' '*'          → block comment, skip to '*/'
+ ├── '`'              → scanTemplateLiteral
+ ├── '/' (regex ctx)  → scanRegexLiteral
+ ├── '"' | "'"        → scanString
+ ├── digit | '.digit' → scanNumber
+ ├── ident-start      → scanIdentifierOrKeyword
+ └── otherwise        → scanOperator (longest-match trie)
+```
+
+The hot path allocates no temporary strings and never calls the regex engine; it only indexes into the source via `charCodeAt`.
+
+## Keyword trie
+
+All keywords (English plus Vietnamese aliases) are built into **a single trie** keyed by char code, including the space character for multi-word keywords such as `khai báo`, `phá vòng lặp`, `kiểu của`, `khởi tạo cha`, `không xác định`.
+
+Each entry carries a boundary rule that decides whether a keyword really stands alone or sits next to another identifier character:
+
+| Boundary | Meaning | Example keywords |
+|---|---|---|
+| `WORD` | The next char must not be `[A-Za-z0-9_]` (emulates JS `\b`) | `var`, `let`, `for`, `khai báo`, `nếu` |
+| `IDENT` | The next char must not be `[A-Za-zÀ-ỹ]` (for Vietnamese keywords ending in a non-ASCII character) | `riêng tư`, `bảo vệ`, `chờ`, `khi mà` |
+| `NONE` | No boundary check (for keywords that may sit directly next to another identifier) | `else`, `return`, `try`, `as`, `from`, `const`, `async` |
+
+Lookup is a single loop walking down the trie:
+
+```
+1. Start at the root, with position p = cursor.
+2. Read charCodeAt(p) and step down to the matching child.
+3. If the node has .type and the boundary check passes → record it as the longest match so far.
+4. Repeat until no child matches.
+5. Return the longest match (if any).
+```
+
+Complexity: O(L), where L is the length of the longest keyword (a constant). It does not depend on the number of keywords.
+
+## Bounded backtracking for multi-word identifiers
+
+VietScript allows multi-word identifiers separated by spaces, for example `con mèo đẹp = 1`. The tokenizer must distinguish:
+
+- `khai báo con mèo đẹp = 1` → `[VAR, IDENT('con mèo đẹp'), '=', ...]`
+- `khai báo một lớp gì đó = 1` → the identifier must stop before `lớp` (because `lớp` is a keyword), which deliberately produces a syntax error.
+
+The logic in `scanIdentifier`:
+
+```
+1. Read the first word (ident-start, ident-cont*).
+2. Loop:
+   - If the next char is a space and the char after the space is ident-start:
+     - Peek: from the position after the space, call matchKeyword().
+     - If a keyword is found → STOP, do not consume the space.
+     - Otherwise → consume space + word, continue.
+   - Else → STOP.
+```
+
+This is **bounded** backtracking: the peek reads at most the length of the longest keyword (a constant), unlike a full backtracking parser.
+
+## Operator longest-match trie
+
+Works like the keyword trie, but for operator symbols. It guarantees that `>>>=` beats `>>>`, that `>>>` and `>>=` each beat `>>`, that `>>` beats `>`, and so on, regardless of declaration order in the `OPERATORS` array.
+
+## Distinguishing regex literals from division
+
+The tokenizer tracks `lastTokenType`. If the previous token is in the `REGEX_PRECEDING_TOKENS` set (for example `=`, `(`, `,`, `return`, ...), a leading `/` starts a regex literal; otherwise it is the division operator.
+
+## Template literal & regex literal
+
+Both are scanned by hand with `charCodeAt` lookups because they are inherently nested:
+
+- **Template literals** support nested `${ ... }`, which may themselves contain further template literals (the recursion is handled through a depth stack).
+- **Regex literals** support character classes `[...]` (a slash inside does not end the regex), the escape `\/`, and trailing flags `[a-z]*`.
+
+## Bench
+
+```bash
+pnpm bench                    # run the bench, print a comparison table across fixtures
+pnpm bench:baseline           # run the bench, dump JSON to packages/parser/bench/baseline.json
+```
+
+Bench files:
+- [`packages/parser/src/__bench__/tokenizer.bench.ts`](https://github.com/imrim12/vietscript/blob/main/packages/parser/src/__bench__/tokenizer.bench.ts): benchmarks the tokenizer on 5 fixtures (tiny / medium / keywordHeavy / stringHeavy / large).
+- [`packages/parser/src/__bench__/parser.bench.ts`](https://github.com/imrim12/vietscript/blob/main/packages/parser/src/__bench__/parser.bench.ts): end-to-end parse benchmark.
+- [`packages/parser/src/__bench__/fixtures/`](https://github.com/imrim12/vietscript/blob/main/packages/parser/src/__bench__/fixtures): representative `.vjs` sources.
+
+When you change the tokenizer, run the bench before and after and include the results in the PR to make review easier.
+
+## Verification
+
+Three layers of tests guard the behavior:
+
+1. **Tokenizer behavior**: `tokenizer-edge.test.ts`, `vietnamese-keywords.test.ts`, `identifier-match-keyword.test.ts` cover edge cases directly.
+2. **Snapshot drift**: `bench-fixture-snapshot.test.ts` pins the token output (type|value|start-end) for each fixture. If a tokenizer change produces different output, the test fails and the change must be reviewed.
+3. **Smoke**: `bench-fixture-parity.test.ts` ensures every fixture tokenizes to EOF without errors.
+
+In addition, 70+ parser tests cover the tokenizer indirectly through each AST node type.
+
+## Adding a new keyword
+
+1. Add it to the enum: [`packages/shared/parser/keyword.enum.ts`](https://github.com/imrim12/vietscript/blob/main/packages/shared/parser/keyword.enum.ts).
+2. Add an entry to the `KEYWORDS` array at the top of `tokenizer.ts`. Choose the right `boundary`:
+   - `WORD` for ASCII keywords (emulates `\b`).
+   - `IDENT` for Vietnamese keywords ending in a non-ASCII character (because `\b` only treats ASCII `[A-Za-z0-9_]` as word characters).
+   - `NONE` only when you deliberately allow the keyword to sit next to another identifier.
+3. Add tests in `vietnamese-keywords.test.ts` (one case using EN, one case using VI).
+4. Run `pnpm test` (full suite) + `pnpm bench` (check for regressions).
+
+## Related documents
+
+- [Roadmap](../roadmap.md): the Phase 0 section records the EN + VI keyword policy.
+- [Compatibility matrix](../compatibility.md): syntax status.
+- Source: `packages/parser/src/tokenizer.ts`, `packages/parser/src/parser.ts`.
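Note on `docs/architecture/tokenizer.md` (keyword trie section): the lookup loop and the three boundary kinds described there can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `tokenizer.ts`. The names `Boundary`, `KeywordEntry`, `TrieNode`, `buildTrie`, `boundaryOk`, and `matchKeyword`, and the token types `AWAIT` and `ELSE`, are hypothetical; the character ranges are copied from the boundary table in the doc.

```typescript
// Illustrative sketch only: every name here is hypothetical and is not
// taken from packages/parser/src/tokenizer.ts.
type Boundary = 'WORD' | 'IDENT' | 'NONE'

interface KeywordEntry {
  text: string
  type: string
  boundary: Boundary
}

interface TrieNode {
  children: Map<number, TrieNode>
  type?: string
  boundary?: Boundary
}

// Build one trie keyed by char code; spaces inside multi-word keywords are ordinary edges.
function buildTrie(entries: KeywordEntry[]): TrieNode {
  const root: TrieNode = { children: new Map() }
  for (const entry of entries) {
    let node = root
    for (let i = 0; i < entry.text.length; i++) {
      const code = entry.text.charCodeAt(i)
      let child = node.children.get(code)
      if (!child) {
        child = { children: new Map() }
        node.children.set(code, child)
      }
      node = child
    }
    node.type = entry.type
    node.boundary = entry.boundary
  }
  return root
}

const isAsciiLetter = (c: number): boolean => (c >= 65 && c <= 90) || (c >= 97 && c <= 122)
const isDigit = (c: number): boolean => c >= 48 && c <= 57

// charCodeAt past the end of the string returns NaN; every comparison below is
// then false, so end of input always counts as a valid boundary.
function boundaryOk(boundary: Boundary, next: number): boolean {
  if (boundary === 'WORD')
    return !(isAsciiLetter(next) || isDigit(next) || next === 95)
  if (boundary === 'IDENT')
    return !(isAsciiLetter(next) || (next >= 0xC0 && next <= 0x1EF9)) // [A-Za-zÀ-ỹ]
  return true
}

// Walk down the trie from `cursor`, remembering the longest entry whose boundary passes.
function matchKeyword(root: TrieNode, source: string, cursor: number): { type: string, end: number } | null {
  let node = root
  let p = cursor
  let best: { type: string, end: number } | null = null
  while (true) {
    const child = node.children.get(source.charCodeAt(p))
    if (!child)
      break
    node = child
    p++
    if (node.type !== undefined && boundaryOk(node.boundary ?? 'NONE', source.charCodeAt(p)))
      best = { type: node.type, end: p }
  }
  return best
}

const trie = buildTrie([
  { text: 'khai báo', type: 'VAR', boundary: 'WORD' },
  { text: 'chờ', type: 'AWAIT', boundary: 'IDENT' },
  { text: 'else', type: 'ELSE', boundary: 'NONE' },
])
console.log(matchKeyword(trie, 'khai báo x = 1', 0))
console.log(matchKeyword(trie, 'khai báon = 1', 0))
```

The loop visits at most as many nodes as the longest keyword has characters, which is where the O(L) bound in the doc comes from. A `Map` per node is used here for brevity; a real hot path might prefer arrays or a flat table.
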
diff --git a/docs/getting-started.md b/docs/getting-started.md
index 3f2eda7..1d0bbfb 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -116,7 +116,11 @@ Dự án phát triển theo TDD. Mỗi cú pháp mới có 3 loại test: parser
 ```bash
 pnpm test              # Chạy toàn bộ test
 pnpm test:coverage     # Test + coverage report
+pnpm bench             # Run the tokenizer benchmark on 5 fixtures
+pnpm bench:baseline    # Bench → JSON baseline (packages/parser/bench/baseline.json)
 pnpm build             # Build tất cả package
 ```
 
+Tokenizer architecture (state machine + keyword trie + bounded backtracking): see [Tokenizer Architecture](./architecture/tokenizer).
+
 Hướng dẫn thêm parser node: [CONTRIBUTING.md](../CONTRIBUTING.md).
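Note on the "bounded backtracking" mentioned in this hunk: the multi-word identifier scan described in `docs/architecture/tokenizer.md` can be sketched as below. This is a simplified illustration, not the project's `scanIdentifier`: it swaps the keyword trie for a short hypothetical `KEYWORDS` list, and `isIdentStart`, `isIdentCont`, and `keywordAt` are assumed helper names with a deliberately coarse character range.

```typescript
// Illustrative sketch only: a tiny keyword list stands in for the real trie,
// and all helper names are hypothetical.
const KEYWORDS: string[] = ['khai báo', 'lớp', 'nếu']

// Coarse ident-start test: ASCII letters, `_`, `$`, and the À-ỹ range used in the doc.
function isIdentStart(c: number): boolean {
  return (c >= 65 && c <= 90) || (c >= 97 && c <= 122) || c === 95 || c === 36
    || (c >= 0xC0 && c <= 0x1EF9)
}

function isIdentCont(c: number): boolean {
  return isIdentStart(c) || (c >= 48 && c <= 57)
}

// True when a keyword starts at `pos` and is not followed by an identifier character.
function keywordAt(source: string, pos: number): boolean {
  return KEYWORDS.some(kw =>
    source.startsWith(kw, pos) && !isIdentCont(source.charCodeAt(pos + kw.length)))
}

// Assumes the caller has already checked that source[cursor] is ident-start.
function scanIdentifier(source: string, cursor: number): { value: string, end: number } {
  let p = cursor
  while (isIdentCont(source.charCodeAt(p))) p++ // first word
  // Extend across a single space only when the following word is not a keyword.
  while (source.charCodeAt(p) === 32 && isIdentStart(source.charCodeAt(p + 1))) {
    if (keywordAt(source, p + 1))
      break // stop before the keyword and leave the space unconsumed
    p++ // consume the space
    while (isIdentCont(source.charCodeAt(p))) p++ // consume the next word
  }
  return { value: source.slice(cursor, p), end: p }
}

console.log(scanIdentifier('con mèo đẹp = 1', 0))
console.log(scanIdentifier('khai báo một lớp gì đó = 1', 'khai báo '.length))
```

The peek in `keywordAt` never reads further than the longest keyword, so the lookahead cost per space is constant, which is the sense in which the backtracking is bounded.
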
diff --git a/docs/roadmap.md b/docs/roadmap.md
index 8703d8d..2e26d62 100644
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@@ -4,6 +4,8 @@ Kế hoạch đầy đủ đưa dự án từ trạng thái hiện tại (parser
 
 Tham chiếu trạng thái: [compatibility.md](./compatibility.md).
 
+The tokenizer architecture is described in [architecture/tokenizer.md](./architecture/tokenizer.md).
+
 ---
 
 ## 0. Nguyên tắc & quyết định cố định
diff --git a/packages/parser/bench/baseline.json b/packages/parser/bench/baseline.json
index bccbd19..554141e 100644
--- a/packages/parser/bench/baseline.json
+++ b/packages/parser/bench/baseline.json
@@ -21,132 +21,132 @@
       "filepath": "/home/user/vietscript/packages/parser/src/__bench__/tokenizer.bench.ts",
       "groups": [
         {
-          "fullName": "packages/parser/src/__bench__/tokenizer.bench.ts > tokenizer (regex baseline)",
+          "fullName": "packages/parser/src/__bench__/tokenizer.bench.ts > tokenizer",
           "benchmarks": [
             {
               "id": "-1068277940_0_0",
               "name": "tokenize tiny (129 chars)",
               "rank": 1,
-              "rme": 1.1426219639578934,
+              "rme": 1.141015862752919,
               "samples": [],
-              "totalTime": 500.33317099999954,
-              "min": 0.4448810000001231,
-              "max": 1.2078570000001037,
-              "hz": 1964.6908439736467,
-              "period": 0.5089859318413017,
-              "mean": 0.5089859318413017,
-              "variance": 0.008654820571585425,
-              "sd": 0.0930312881324634,
-              "sem": 0.002967237270752279,
-              "df": 982,
+              "totalTime": 500.00185199994417,
+              "min": 0.0020930000000589644,
+              "max": 0.7843789999999444,
+              "hz": 299578.8903598236,
+              "period": 0.003338018906468684,
+              "mean": 0.003338018906468684,
+              "variance": 0.000056562894653669586,
+              "sd": 0.007520830715663635,
+              "sem": 0.0000194323087880098,
+              "df": 149789,
               "critical": 1.96,
-              "moe": 0.005815785050674467,
-              "p75": 0.5024739999998928,
-              "p99": 0.8974330000000919,
-              "p995": 0.955120000000079,
-              "p999": 1.2078570000001037,
-              "sampleCount": 983,
-              "median": 0.4800920000000133
+              "moe": 0.000038087325224499204,
+              "p75": 0.003083000000060565,
+              "p99": 0.013623999999936132,
+              "p995": 0.020438999999896623,
+              "p999": 0.03829700000005687,
+              "sampleCount": 149790,
+              "median": 0.0024290000001201406
             },
             {
               "id": "-1068277940_0_1",
               "name": "tokenize medium (1777 chars)",
               "rank": 3,
-              "rme": 0.6302605202939511,
+              "rme": 1.2727299895304194,
               "samples": [],
-              "totalTime": 516.6511870000006,
-              "min": 19.24686200000042,
-              "max": 20.44344299999989,
-              "hz": 50.324088387316465,
-              "period": 19.871199500000024,
-              "mean": 19.871199500000024,
-              "variance": 0.09610086164718201,
-              "sd": 0.3100013897504042,
-              "sem": 0.06079627444531513,
-              "df": 25,
-              "critical": 2.06,
-              "moe": 0.12524032535734916,
-              "p75": 20.03240300000016,
-              "p99": 20.44344299999989,
-              "p995": 20.44344299999989,
-              "p999": 20.44344299999989,
-              "sampleCount": 26,
-              "median": 19.872763499999905
+              "totalTime": 500.0366130000066,
+              "min": 0.03256499999997686,
+              "max": 1.0443820000000414,
+              "hz": 21186.448601114455,
+              "period": 0.04719998234849977,
+              "mean": 0.04719998234849977,
+              "variance": 0.0009951855316763752,
+              "sd": 0.03154656132887347,
+              "sem": 0.0003064940461236842,
+              "df": 10593,
+              "critical": 1.96,
+              "moe": 0.0006007283304024209,
+              "p75": 0.05080900000029942,
+              "p99": 0.12844799999993484,
+              "p995": 0.15704199999981938,
+              "p999": 0.545332000000144,
+              "sampleCount": 10594,
+              "median": 0.037829999999985375
             },
             {
               "id": "-1068277940_0_2",
               "name": "tokenize keywordHeavy (2511 chars)",
               "rank": 4,
-              "rme": 0.5495901106494003,
+              "rme": 1.2775559544495476,
               "samples": [],
-              "totalTime": 506.61291200000005,
-              "min": 27.58257700000013,
-              "max": 28.69541799999979,
-              "hz": 35.53008534452829,
-              "period": 28.14516177777778,
-              "mean": 28.14516177777778,
-              "variance": 0.09673706615570357,
-              "sd": 0.3110258287597729,
-              "sem": 0.07330949088006712,
-              "df": 17,
-              "critical": 2.11,
-              "moe": 0.15468302575694162,
-              "p75": 28.271655999999894,
-              "p99": 28.69541799999979,
-              "p995": 28.69541799999979,
-              "p999": 28.69541799999979,
-              "sampleCount": 18,
-              "median": 28.091197499999907
+              "totalTime": 500.01672599998346,
+              "min": 0.040463000000272586,
+              "max": 1.1820819999998093,
+              "hz": 18065.395676384433,
+              "period": 0.055354447691794914,
+              "mean": 0.055354447691794914,
+              "variance": 0.0011759389410946788,
+              "sd": 0.03429196613049008,
+              "sem": 0.00036080818496897254,
+              "df": 9032,
+              "critical": 1.96,
+              "moe": 0.0007071840425391862,
+              "p75": 0.05822900000021036,
+              "p99": 0.13829299999997602,
+              "p995": 0.17417199999999866,
+              "p999": 0.578490000000329,
+              "sampleCount": 9033,
+              "median": 0.04572200000029625
             },
             {
               "id": "-1068277940_0_3",
               "name": "tokenize stringHeavy (1410 chars)",
               "rank": 2,
-              "rme": 0.6673262731198819,
+              "rme": 1.1398010880120715,
               "samples": [],
-              "totalTime": 500.565410000002,
-              "min": 6.132623999999851,
-              "max": 7.134853000000021,
-              "hz": 149.83056859641923,
-              "period": 6.674205466666693,
-              "mean": 6.674205466666693,
-              "variance": 0.037456074319165325,
-              "sd": 0.193535718458287,
-              "sem": 0.02234757982993992,
-              "df": 74,
-              "critical": 1.993,
-              "moe": 0.044538726601070264,
-              "p75": 6.781645000000026,
-              "p99": 7.134853000000021,
-              "p995": 7.134853000000021,
-              "p999": 7.134853000000021,
-              "sampleCount": 75,
-              "median": 6.676206999999977
+              "totalTime": 500.08148599995,
+              "min": 0.010930999999800406,
+              "max": 0.9104910000000928,
+              "hz": 67796.95099530918,
+              "period": 0.014749925849455817,
+              "mean": 0.014749925849455817,
+              "variance": 0.0002494460350087862,
+              "sd": 0.015793860674603477,
+              "sem": 0.00008577541597605673,
+              "df": 33903,
+              "critical": 1.96,
+              "moe": 0.00016811981531307117,
+              "p75": 0.013374000000112574,
+              "p99": 0.04493300000012823,
+              "p995": 0.05427600000029997,
+              "p999": 0.08798699999988457,
+              "sampleCount": 33904,
+              "median": 0.01170300000012503
             },
             {
               "id": "-1068277940_0_4",
               "name": "tokenize large (14262 chars)",
               "rank": 5,
-              "rme": 1.1911127448688184,
+              "rme": 1.6847795453088072,
               "samples": [],
-              "totalTime": 8981.281721999996,
-              "min": 888.0166659999995,
-              "max": 939.8688099999999,
-              "hz": 1.1134268258732618,
-              "period": 898.1281721999997,
-              "mean": 898.1281721999997,
-              "variance": 223.66456306729435,
-              "sd": 14.955419187281056,
-              "sem": 4.729318799439242,
-              "df": 9,
-              "critical": 2.262,
-              "moe": 10.697719124331565,
-              "p75": 896.0057759999981,
-              "p99": 939.8688099999999,
-              "p995": 939.8688099999999,
-              "p999": 939.8688099999999,
-              "sampleCount": 10,
-              "median": 893.9150835
+              "totalTime": 500.3381490000138,
+              "min": 0.2647550000001502,
+              "max": 1.21205800000007,
+              "hz": 2808.1008869862553,
+              "period": 0.35611256156584614,
+              "mean": 0.35611256156584614,
+              "variance": 0.013165123286550166,
+              "sd": 0.11473937112669812,
+              "sem": 0.0030610773446615356,
+              "df": 1404,
+              "critical": 1.96,
+              "moe": 0.005999711595536609,
+              "p75": 0.3726120000001174,
+              "p99": 0.8343439999998736,
+              "p995": 0.9964389999995547,
+              "p999": 1.1970249999994849,
+              "sampleCount": 1405,
+              "median": 0.3244280000008075
             }
           ]
         }
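Note on the `baseline.json` hunk above: the removed values were recorded under the group "tokenizer (regex baseline)" and the added values under "tokenizer", so the `hz` fields give a rough throughput ratio per fixture. The sketch below only does that arithmetic. The `hz` numbers are copied from the hunk; `BenchRow` and `speedup` are hypothetical names, and the comparison assumes both runs were taken on comparable hardware, which the diff does not state.

```typescript
// Illustrative arithmetic only: hz values are copied from the baseline.json hunk;
// the type and function names are hypothetical.
interface BenchRow {
  fixture: string
  beforeHz: number // removed line, group "tokenizer (regex baseline)"
  afterHz: number // added line, group "tokenizer"
}

const rows: BenchRow[] = [
  { fixture: 'tiny', beforeHz: 1964.6908439736467, afterHz: 299578.8903598236 },
  { fixture: 'medium', beforeHz: 50.324088387316465, afterHz: 21186.448601114455 },
  { fixture: 'keywordHeavy', beforeHz: 35.53008534452829, afterHz: 18065.395676384433 },
  { fixture: 'stringHeavy', beforeHz: 149.83056859641923, afterHz: 67796.95099530918 },
  { fixture: 'large', beforeHz: 1.1134268258732618, afterHz: 2808.1008869862553 },
]

// Ratio of operations per second after the change to before it.
function speedup(row: BenchRow): number {
  return row.afterHz / row.beforeHz
}

for (const row of rows)
  console.log(`${row.fixture}: ${speedup(row).toFixed(1)}x`)
```

Ratios like these are sensitive to machine load and sample counts, so they are best read as orders of magnitude, not precise figures.
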
diff --git a/packages/parser/bench/comparison.json b/packages/parser/bench/comparison.json
deleted file mode 100644
index 31011fc..0000000
--- a/packages/parser/bench/comparison.json
+++ /dev/null
@@ -1,301 +0,0 @@
-{
-  "files": [
-    {
-      "filepath": "/home/user/vietscript/packages/parser/src/__bench__/parser.bench.ts",
-      "groups": [
-        {
-          "fullName": "packages/parser/src/__bench__/parser.bench.ts > parser end-to-end (regex baseline)",
-          "benchmarks": [
-            {
-              "id": "1166627732_0_0",
-              "name": "parse tiny (129 chars)",
-              "rank": 1,
-              "rme": 0,
-              "samples": []
-            }
-          ]
-        }
-      ]
-    },
-    {
-      "filepath": "/home/user/vietscript/packages/parser/src/__bench__/tokenizer.bench.ts",
-      "groups": [
-        {
-          "fullName": "packages/parser/src/__bench__/tokenizer.bench.ts > tokenize tiny (129 chars)",
-          "benchmarks": [
-            {
-              "id": "-1068277940_0_0",
-              "name": "regex",
-              "rank": 2,
-              "rme": 1.5350840371170378,
-              "samples": [],
-              "totalTime": 500.0193509999999,
-              "min": 0.4276030000000901,
-              "max": 1.3723669999999402,
-              "hz": 1769.931500111083,
-              "period": 0.5649936169491524,
-              "mean": 0.5649936169491524,
-              "variance": 0.017329359925297418,
-              "sd": 0.13164102675570946,
-              "sem": 0.0044250647063860315,
-              "df": 884,
-              "critical": 1.96,
-              "moe": 0.008673126824516621,
-              "p75": 0.5542259999999715,
-              "p99": 1.0561989999998787,
-              "p995": 1.1007520000000568,
-              "p999": 1.3723669999999402,
-              "sampleCount": 885,
-              "median": 0.5261639999998806
-            },
-            {
-              "id": "-1068277940_0_1",
-              "name": "fsm",
-              "rank": 1,
-              "rme": 0.6133533190779474,
-              "samples": [],
-              "totalTime": 500.00123599994686,
-              "min": 0.002124999999978172,
-              "max": 0.4388850000000275,
-              "hz": 364125.0998827918,
-              "period": 0.0027463088930751818,
-              "mean": 0.0027463088930751818,
-              "variance": 0.000013447134783211087,
-              "sd": 0.0036670335126926623,
-              "sem": 0.000008594171810106872,
-              "df": 182062,
-              "critical": 1.96,
-              "moe": 0.000016844576747809467,
-              "p75": 0.00261000000000422,
-              "p99": 0.005317999999988388,
-              "p995": 0.0074489999997240375,
-              "p999": 0.023729999999886786,
-              "sampleCount": 182063,
-              "median": 0.0025460000001658045
-            }
-          ]
-        },
-        {
-          "fullName": "packages/parser/src/__bench__/tokenizer.bench.ts > tokenize medium (1777 chars)",
-          "benchmarks": [
-            {
-              "id": "-1068277940_1_0",
-              "name": "regex",
-              "rank": 2,
-              "rme": 3.524667752133275,
-              "samples": [],
-              "totalTime": 518.7961389999991,
-              "min": 19.4718170000001,
-              "max": 28.74305000000004,
-              "hz": 48.188485072746545,
-              "period": 20.751845559999964,
-              "mean": 20.751845559999964,
-              "variance": 3.139571992830767,
-              "sd": 1.7718837413416173,
-              "sem": 0.35437674826832344,
-              "df": 24,
-              "critical": 2.064,
-              "moe": 0.7314336084258196,
-              "p75": 20.659483999999793,
-              "p99": 28.74305000000004,
-              "p995": 28.74305000000004,
-              "p999": 28.74305000000004,
-              "sampleCount": 25,
-              "median": 20.37862300000006
-            },
-            {
-              "id": "-1068277940_1_1",
-              "name": "fsm",
-              "rank": 1,
-              "rme": 0.6624735904903833,
-              "samples": [],
-              "totalTime": 500.0035550000025,
-              "min": 0.03033499999992273,
-              "max": 0.5823090000003504,
-              "hz": 26333.812766591098,
-              "period": 0.03797399217741342,
-              "mean": 0.03797399217741342,
-              "variance": 0.0002169123830559907,
-              "sd": 0.014727945649546331,
-              "sem": 0.00012835085175012654,
-              "df": 13166,
-              "critical": 1.96,
-              "moe": 0.000251567669430248,
-              "p75": 0.036670999999842024,
-              "p99": 0.06638299999985975,
-              "p995": 0.0816330000002381,
-              "p999": 0.2839810000000398,
-              "sampleCount": 13167,
-              "median": 0.035983999999643856
-            }
-          ]
-        },
-        {
-          "fullName": "packages/parser/src/__bench__/tokenizer.bench.ts > tokenize keywordHeavy (2511 chars)",
-          "benchmarks": [
-            {
-              "id": "-1068277940_2_0",
-              "name": "regex",
-              "rank": 2,
-              "rme": 1.7877956419477135,
-              "samples": [],
-              "totalTime": 515.2327339999997,
-              "min": 29.07310099999995,
-              "max": 33.27790499999992,
-              "hz": 32.99479803625988,
-              "period": 30.307807882352925,
-              "mean": 30.307807882352925,
-              "variance": 1.1105087871830166,
-              "sd": 1.053806807333781,
-              "sem": 0.2555856926842411,
-              "df": 16,
-              "critical": 2.12,
-              "moe": 0.5418416684905911,
-              "p75": 30.90304600000036,
-              "p99": 33.27790499999992,
-              "p995": 33.27790499999992,
-              "p999": 33.27790499999992,
-              "sampleCount": 17,
-              "median": 30.187858000000233
-            },
-            {
-              "id": "-1068277940_2_1",
-              "name": "fsm",
-              "rank": 1,
-              "rme": 0.6175740445255259,
-              "samples": [],
-              "totalTime": 500.0284290000027,
-              "min": 0.04476100000010774,
-              "max": 0.5986629999997604,
-              "hz": 20106.856764337987,
-              "period": 0.04973427779988091,
-              "mean": 0.04973427779988091,
-              "variance": 0.0002468973565791904,
-              "sd": 0.015712967783941722,
-              "sem": 0.00015670713822667616,
-              "df": 10053,
-              "critical": 1.96,
-              "moe": 0.00030714599092428527,
-              "p75": 0.047118000000409666,
-              "p99": 0.0876379999999699,
-              "p995": 0.10085900000012771,
-              "p999": 0.3146649999998772,
-              "sampleCount": 10054,
-              "median": 0.04607899999973597
-            }
-          ]
-        },
-        {
-          "fullName": "packages/parser/src/__bench__/tokenizer.bench.ts > tokenize stringHeavy (1410 chars)",
-          "benchmarks": [
-            {
-              "id": "-1068277940_3_0",
-              "name": "regex",
-              "rank": 2,
-              "rme": 0.7766105799152846,
-              "samples": [],
-              "totalTime": 503.6576939999986,
-              "min": 6.501181999999972,
-              "max": 7.724500999999691,
-              "hz": 142.95423430978144,
-              "period": 6.995245749999981,
-              "mean": 6.995245749999981,
-              "variance": 0.05341669000115708,
-              "sd": 0.2311205096938761,
-              "sem": 0.027237813279305165,
-              "df": 71,
-              "critical": 1.9945,
-              "moe": 0.05432581858557415,
-              "p75": 7.049482999999782,
-              "p99": 7.724500999999691,
-              "p995": 7.724500999999691,
-              "p999": 7.724500999999691,
-              "sampleCount": 72,
-              "median": 6.967653500000324
-            },
-            {
-              "id": "-1068277940_3_1",
-              "name": "fsm",
-              "rank": 1,
-              "rme": 0.5493915324505317,
-              "samples": [],
-              "totalTime": 500.00108399988767,
-              "min": 0.01133399999980611,
-              "max": 0.5432410000003074,
-              "hz": 69159.8500614606,
-              "period": 0.014459256333137296,
-              "mean": 0.014459256333137296,
-              "variance": 0.00005680266124359761,
-              "sd": 0.007536754025679597,
-              "sem": 0.0000405295560967212,
-              "df": 34579,
-              "critical": 1.96,
-              "moe": 0.00007943792994957356,
-              "p75": 0.01378500000009808,
-              "p99": 0.03121000000010099,
-              "p995": 0.03526799999963259,
-              "p999": 0.052106000000094355,
-              "sampleCount": 34580,
-              "median": 0.013637999999446038
-            }
-          ]
-        },
-        {
-          "fullName": "packages/parser/src/__bench__/tokenizer.bench.ts > tokenize large (14262 chars)",
-          "benchmarks": [
-            {
-              "id": "-1068277940_4_0",
-              "name": "regex",
-              "rank": 2,
-              "rme": 0.9425965082975333,
-              "samples": [],
-              "totalTime": 9050.40537,
-              "min": 884.817156000001,
-              "max": 926.1038919999992,
-              "hz": 1.104922883691783,
-              "period": 905.0405370000001,
-              "mean": 905.0405370000001,
-              "variance": 142.2337026237795,
-              "sd": 11.92617720075379,
-              "sem": 3.7713883733153164,
-              "df": 9,
-              "critical": 2.262,
-              "moe": 8.530880500439245,
-              "p75": 914.2840939999987,
-              "p99": 926.1038919999992,
-              "p995": 926.1038919999992,
-              "p999": 926.1038919999992,
-              "sampleCount": 10,
-              "median": 903.8145745000011
-            },
-            {
-              "id": "-1068277940_4_1",
-              "name": "fsm",
-              "rank": 1,
-              "rme": 0.7733163929814905,
-              "samples": [],
-              "totalTime": 500.2985060000319,
-              "min": 0.24359799999729148,
-              "max": 0.892772999999579,
-              "hz": 3280.041775699197,
-              "period": 0.30487416575260934,
-              "mean": 0.30487416575260934,
-              "variance": 0.002374390365389728,
-              "sd": 0.04872771660348685,
-              "sem": 0.0012028785212910658,
-              "df": 1640,
-              "critical": 1.96,
-              "moe": 0.002357641901730489,
-              "p75": 0.31062000000019907,
-              "p99": 0.5434040000000095,
-              "p995": 0.5940250000021479,
-              "p999": 0.6961689999989176,
-              "sampleCount": 1641,
-              "median": 0.2921910000004573
-            }
-          ]
-        }
-      ]
-    }
-  ]
-}
\ No newline at end of file
diff --git a/packages/parser/src/__bench__/parser.bench.ts b/packages/parser/src/__bench__/parser.bench.ts
index eb2766e..466f67f 100644
--- a/packages/parser/src/__bench__/parser.bench.ts
+++ b/packages/parser/src/__bench__/parser.bench.ts
@@ -7,7 +7,7 @@ import { fixtures } from './fixtures'
 // The other fixtures still exercise the tokenizer (see tokenizer.bench.ts).
 const PARSE_OK = ['tiny'] as const
 
-describe('parser end-to-end (regex baseline)', () => {
+describe('parser end-to-end', () => {
   for (const name of PARSE_OK) {
     const source = fixtures[name]
     bench(`parse ${name} (${source.length} chars)`, () => {
diff --git a/packages/parser/src/__bench__/tokenizer.bench.ts b/packages/parser/src/__bench__/tokenizer.bench.ts
index f55bbf4..dff3f19 100644
--- a/packages/parser/src/__bench__/tokenizer.bench.ts
+++ b/packages/parser/src/__bench__/tokenizer.bench.ts
@@ -1,12 +1,11 @@
 import type { Token } from '@vietscript/shared'
 import { Parser } from '@parser/parser'
 import { Tokenizer } from '@parser/tokenizer'
-import { TokenizerFSM } from '@parser/tokenizer-fsm'
 import { bench, describe } from 'vitest'
 
 import { fixtures } from './fixtures'
 
-function tokenizeRegex(source: string): number {
+function tokenizeAll(source: string): number {
   const parser = new Parser()
   parser.syntax = source
   parser.tokenizer = new Tokenizer(parser)
@@ -19,29 +18,13 @@ function tokenizeRegex(source: string): number {
   return count
 }
 
-function tokenizeFSM(source: string): number {
-  const parser = new Parser()
-  parser.syntax = source
-  parser.tokenizer = new TokenizerFSM(parser)
-  let count = 0
-  let tok: Token | null = parser.tokenizer.getNextToken()
-  while (tok !== null) {
-    count++
-    tok = parser.tokenizer.getNextToken()
-  }
-  return count
-}
-
 const NAMES = ['tiny', 'medium', 'keywordHeavy', 'stringHeavy', 'large'] as const
 
-for (const name of NAMES) {
-  const source = fixtures[name]
-  describe(`tokenize ${name} (${source.length} chars)`, () => {
-    bench('regex', () => {
-      tokenizeRegex(source)
+describe('tokenizer', () => {
+  for (const name of NAMES) {
+    const source = fixtures[name]
+    bench(`tokenize ${name} (${source.length} chars)`, () => {
+      tokenizeAll(source)
     })
-    bench('fsm', () => {
-      tokenizeFSM(source)
-    })
-  })
-}
+  }
+})
diff --git a/packages/parser/src/__test__/tokenizer-fsm.test.ts b/packages/parser/src/__test__/tokenizer-fsm.test.ts
deleted file mode 100644
index 57b373f..0000000
--- a/packages/parser/src/__test__/tokenizer-fsm.test.ts
+++ /dev/null
@@ -1,232 +0,0 @@
-import type { Token } from '@vietscript/shared'
-import { Parser } from '@parser/parser'
-
-import { Tokenizer } from '@parser/tokenizer'
-import { TokenizerFSM } from '@parser/tokenizer-fsm'
-import { Keyword } from '@vietscript/shared'
-
-import { fixtures } from '../__bench__/fixtures'
-
-function tokenizeRegex(source: string): Token[] {
-  const parser = new Parser()
-  parser.syntax = source
-  parser.tokenizer = new Tokenizer(parser)
-  const out: Token[] = []
-  let t = parser.tokenizer.getNextToken()
-  while (t !== null) {
-    out.push(t)
-    t = parser.tokenizer.getNextToken()
-  }
-  return out
-}
-
-function tokenizeFSM(source: string): Token[] {
-  const parser = new Parser()
-  parser.syntax = source
-  parser.tokenizer = new TokenizerFSM(parser)
-  const out: Token[] = []
-  let t = parser.tokenizer.getNextToken()
-  while (t !== null) {
-    out.push(t)
-    t = parser.tokenizer.getNextToken()
-  }
-  return out
-}
-
-describe('tokenizer-fsm: parity with regex tokenizer', () => {
-  it('empty input', () => {
-    expect(tokenizeFSM('')).toEqual(tokenizeRegex(''))
-  })
-
-  it('whitespace only', () => {
-    expect(tokenizeFSM('   \n\t')).toEqual(tokenizeRegex('   \n\t'))
-  })
-
-  it('line comment', () => {
-    expect(tokenizeFSM('// hello\n42')).toEqual(tokenizeRegex('// hello\n42'))
-  })
-
-  it('block comment', () => {
-    expect(tokenizeFSM('/* a\nb */ x')).toEqual(tokenizeRegex('/* a\nb */ x'))
-  })
-
-  it('integer literal', () => {
-    expect(tokenizeFSM('42')).toEqual(tokenizeRegex('42'))
-  })
-
-  it('hex literal', () => {
-    expect(tokenizeFSM('0xFF')).toEqual(tokenizeRegex('0xFF'))
-  })
-
-  it('octal literal', () => {
-    expect(tokenizeFSM('0o17')).toEqual(tokenizeRegex('0o17'))
-  })
-
-  it('binary literal', () => {
-    expect(tokenizeFSM('0b1010')).toEqual(tokenizeRegex('0b1010'))
-  })
-
-  it('decimal with fraction', () => {
-    expect(tokenizeFSM('3.14')).toEqual(tokenizeRegex('3.14'))
-  })
-
-  it('exponent literal', () => {
-    expect(tokenizeFSM('1.5e10')).toEqual(tokenizeRegex('1.5e10'))
-  })
-
-  it('bigint literal', () => {
-    expect(tokenizeFSM('100n')).toEqual(tokenizeRegex('100n'))
-  })
-
-  it('numeric separators', () => {
-    expect(tokenizeFSM('1_000_000')).toEqual(tokenizeRegex('1_000_000'))
-  })
-
-  it('leading dot decimal', () => {
-    expect(tokenizeFSM('.5')).toEqual(tokenizeRegex('.5'))
-  })
-
-  it('double-quoted string', () => {
-    expect(tokenizeFSM('"hello"')).toEqual(tokenizeRegex('"hello"'))
-  })
-
-  it('single-quoted string', () => {
-    expect(tokenizeFSM('\'hello\'')).toEqual(tokenizeRegex('\'hello\''))
-  })
-
-  it('string with escape', () => {
-    expect(tokenizeFSM('"a\\"b"')).toEqual(tokenizeRegex('"a\\"b"'))
-  })
-
-  it('string with unicode escape', () => {
-    expect(tokenizeFSM('"\\u00E1"')).toEqual(tokenizeRegex('"\\u00E1"'))
-  })
-
-  it('template literal simple', () => {
-    expect(tokenizeFSM('`abc`')).toEqual(tokenizeRegex('`abc`'))
-  })
-
-  it('template literal with interpolation', () => {
-    expect(tokenizeFSM('`a${b}c`')).toEqual(tokenizeRegex('`a${b}c`'))
-  })
-
-  it('nested template literal', () => {
-    expect(tokenizeFSM('`a${`b${c}d`}e`')).toEqual(tokenizeRegex('`a${`b${c}d`}e`'))
-  })
-
-  it('regex literal', () => {
-    expect(tokenizeFSM('x = /abc/g')).toEqual(tokenizeRegex('x = /abc/g'))
-  })
-
-  it('division vs regex (after identifier)', () => {
-    expect(tokenizeFSM('x / y')).toEqual(tokenizeRegex('x / y'))
-  })
-
-  it('regex with character class', () => {
-    expect(tokenizeFSM('var r = /[a-z]+/i')).toEqual(tokenizeRegex('var r = /[a-z]+/i'))
-  })
-
-  it('all single-char operators', () => {
-    expect(tokenizeFSM('+-*/%~!&|^?:.,;')).toEqual(tokenizeRegex('+-*/%~!&|^?:.,;'))
-  })
-
-  it('compound operator >>>=', () => {
-    expect(tokenizeFSM('a >>>= b')).toEqual(tokenizeRegex('a >>>= b'))
-  })
-
-  it('arrow function tokens', () => {
-    expect(tokenizeFSM('(a, b) => a + b')).toEqual(tokenizeRegex('(a, b) => a + b'))
-  })
-
-  it('all bracket types', () => {
-    expect(tokenizeFSM('([{}])')).toEqual(tokenizeRegex('([{}])'))
-  })
-
-  it('english keyword: var', () => {
-    expect(tokenizeFSM('var x = 1')).toEqual(tokenizeRegex('var x = 1'))
-  })
-
-  it('vietnamese keyword: khai báo', () => {
-    expect(tokenizeFSM('khai báo x = 1')).toEqual(tokenizeRegex('khai báo x = 1'))
-  })
-
-  it('vietnamese keyword: phá vòng lặp', () => {
-    expect(tokenizeFSM('phá vòng lặp')).toEqual(tokenizeRegex('phá vòng lặp'))
-  })
-
-  it('vietnamese keyword: kiểu của', () => {
-    expect(tokenizeFSM('kiểu của x')).toEqual(tokenizeRegex('kiểu của x'))
-  })
-
-  it('multi-word identifier (no embedded keyword)', () => {
-    expect(tokenizeFSM('khai báo con mèo đẹp = 1')).toEqual(tokenizeRegex('khai báo con mèo đẹp = 1'))
-  })
-
-  it('embedded keyword inside identifier should split', () => {
-    const fsm = tokenizeFSM('khai báo một lớp gì đó = 1')
-    const rx = tokenizeRegex('khai báo một lớp gì đó = 1')
-    expect(fsm).toEqual(rx)
-  })
-
-  it('boolean keywords vi/en', () => {
-    expect(tokenizeFSM('đúng sai true false')).toEqual(tokenizeRegex('đúng sai true false'))
-  })
-
-  it('null/Infinity/NaN/undefined', () => {
-    expect(tokenizeFSM('null Infinity NaN undefined rỗng vô cực không xác định')).toEqual(tokenizeRegex('null Infinity NaN undefined rỗng vô cực không xác định'))
-  })
-
-  it('throw on unknown char', () => {
-    expect(() => tokenizeFSM('@@@')).toThrow()
-  })
-
-  it('riêng tư followed by identifier should not consume trailing letters', () => {
-    const t = tokenizeFSM('riêng tư x')
-    expect(t).toHaveLength(2)
-    expect(t[0].type).toBe(Keyword.PRIVATE)
-    expect(t[1].value).toBe('x')
-  })
-
-  it('bảo vệ keyword followed by identifier', () => {
-    const t = tokenizeFSM('bảo vệ phương thức')
-    expect(t).toHaveLength(2)
-    expect(t[0].type).toBe(Keyword.PROTECTED)
-    expect(t[1].type).toBe(Keyword.IDENTIFIER)
-  })
-})
-
-describe('tokenizer-fsm: produces identical output to regex tokenizer on bench fixtures', () => {
-  for (const [name, source] of Object.entries(fixtures)) {
-    it(`fixture parity: ${name}`, () => {
-      const rx = tokenizeRegex(source)
-      const fsm = tokenizeFSM(source)
-      expect(fsm.length).toBe(rx.length)
-      for (let i = 0; i < rx.length; i++) {
-        expect({ idx: i, ...fsm[i] }).toEqual({ idx: i, ...rx[i] })
-      }
-    }, 60_000)
-  }
-})
-
-describe('tokenizer-fsm: rollback works', () => {
-  it('rollback decrements cursor and lookahead end', () => {
-    const parser = new Parser()
-    parser.syntax = 'abc; def'
-    parser.tokenizer = new TokenizerFSM(parser)
-    const first = parser.tokenizer.getNextToken()
-    expect(first?.value).toBe('abc')
-    const before = (parser.tokenizer as TokenizerFSM).getCursor()
-    ;(parser.tokenizer as TokenizerFSM).rollback(2)
-    expect((parser.tokenizer as TokenizerFSM).getCursor()).toBe(before - 2)
-  })
-})
-
-describe('tokenizer-fsm: isEOF', () => {
-  it('is true at end', () => {
-    const parser = new Parser()
-    parser.syntax = 'a'
-    parser.tokenizer = new TokenizerFSM(parser)
-    parser.tokenizer.getNextToken()
-    expect(parser.tokenizer.isEOF()).toBe(true)
-  })
-})
diff --git a/packages/parser/src/constants/specs.ts b/packages/parser/src/constants/specs.ts
deleted file mode 100644
index 5f64172..0000000
--- a/packages/parser/src/constants/specs.ts
+++ /dev/null
@@ -1,159 +0,0 @@
-import type { Spec } from '@vietscript/shared'
-import { Keyword } from '@vietscript/shared'
-
-export const SpecIdentifier = [/^[A-Za-z\u00C0-\u1EF9][A-Za-z0-9\u00C0-\u1EF9]*(\s[A-Za-z\u00C0-\u1EF9][A-Za-z0-9\u00C0-\u1EF9]*)*/, Keyword.IDENTIFIER] as Spec
-
-export const Specs: Array<Spec> = [
-  // --------------------------------------
-  // Whitespace:
-  [/^\s+/, null],
-
-  // --------------------------------------
-  // Comments:
-  [/^\/\/.*/, null],
-  [/^\/\*[\s\S]*?\*\//, null],
-
-  // --------------------------------------
-  // Symbols and delimiters (ordered longest-first to avoid prefix conflicts):
-  [/^\[/, '['],
-  [/^\]/, ']'],
-  [/^\(/, '('],
-  [/^\)/, ')'],
-  [/^\{/, '{'],
-  [/^\}/, '}'],
-  [/^;/, ';'],
-  [/^,/, ','],
-  [/^:/, ':'],
-  [/^\.{3}/, '...'],
-  [/^\.[\d_]+([eE][+-]?\d[\d_]*)?n?/, Keyword.NUMBER],
-  [/^\./, '.'],
-  [/^#/, '#'],
-
-  [/^>>>=/, '>>>='],
-  [/^>>>/, '>>>'],
-  [/^>>=/, '>>='],
-  [/^>>/, '>>'],
-  [/^<<=/, '<<='],
-  [/^<</, '<<'],
-  [/^<=/, '<='],
-  [/^>=/, '>='],
-  [/^</, '<'],
-  [/^>/, '>'],
-
-  [/^===/, '==='],
-  [/^!==/, '!=='],
-  [/^==/, '=='],
-  [/^!=/, '!='],
-
-  [/^=>/, '=>'],
-  [/^\*\*=/, '**='],
-  [/^\*\*/, '**'],
-  [/^\*=/, '*='],
-  [/^\*/, '*'],
-  [/^\+\+/, '++'],
-  [/^\+=/, '+='],
-  [/^\+/, '+'],
-  [/^--/, '--'],
-  [/^-=/, '-='],
-  [/^-/, '-'],
-  [/^\/=/, '/='],
-  [/^\//, '/'],
-  [/^%=/, '%='],
-  [/^%/, '%'],
-
-  [/^&&=/, '&&='],
-  [/^&&/, '&&'],
-  [/^&=/, '&='],
-  [/^&/, '&'],
-  [/^\|\|=/, '||='],
-  [/^\|\|/, '||'],
-  [/^\|=/, '|='],
-  [/^\|/, '|'],
-  [/^\^=/, '^='],
-  [/^\^/, '^'],
-  [/^~/, '~'],
-  [/^!/, '!'],
-
-  [/^\?\?=/, '??='],
-  [/^\?\?/, '??'],
-  [/^\?\./, '?.'],
-  [/^\?/, '?'],
-
-  [/^=/, '='],
-
-  // --------------------------------------
-  // Keywords
-  [/^(var|khai b\u00E1o)\b/, Keyword.VAR],
-  [/^(break|ph\u00E1 v\u00F2ng l\u1EB7p)\b/, Keyword.BREAK],
-  [/^(do|th\u1EF1c hi\u1EC7n)\b/, Keyword.DO],
-  [/^(instanceof|l\u00E0 ki\u1EC3u)\b/, Keyword.INSTANCEOF],
-  [/^(typeof|ki\u1EC3u c\u1EE7a)\b/, Keyword.TYPEOF],
-  [/^(switch|duy\u1EC7t)\b/, Keyword.SWITCH],
-  [/^(case|tr\u01B0\u1EDDng h\u1EE3p)\b/, Keyword.CASE],
-  [/^(if|n\u1EBFu)\b/, Keyword.IF],
-  [/^(else|kh\u00F4ng th\u00EC)/, Keyword.ELSE],
-  [/^new\b/, Keyword.NEW],
-  [/^(catch|b\u1EAFt l\u1ED7i)\b/, Keyword.CATCH],
-  [/^(finally|cu\u1ED1i c\u00F9ng)\b/, Keyword.FINALLY],
-  [/^(return|tr\u1EA3 v\u1EC1)/, Keyword.RETURN],
-  [/^void\b/, Keyword.VOID],
-  [/^(continue|ti\u1EBFp t\u1EE5c)\b/, Keyword.CONTINUE],
-  [/^(for|l\u1EB7p)\b/, Keyword.FOR],
-  [/^(while\b|khi m\u00E0(?![A-Za-z\u00C0-\u1EF9]))/, Keyword.WHILE],
-  [/^debugger\b/, Keyword.DEBUGGER],
-  [/^(function|h\u00E0m)\b/, Keyword.FUNCTION],
-  [/^(this\b|\u0111\u00E2y\b)/, Keyword.THIS],
-  [/^with\b/, Keyword.WITH],
-  [/^(default|m\u1EB7c \u0111\u1ECBnh)\b/, Keyword.DEFAULT],
-  [/^(throw|b\u00E1o l\u1ED7i)\b/, Keyword.THROW],
-  [/^(delete\b|xo\u00E1(?![A-Za-z\u00C0-\u1EF9]))/, Keyword.DELETE],
-  [/^(in|trong)\b/, Keyword.IN],
-  [/^(of|c\u1EE7a)\b/, Keyword.OF],
-  [/^(try|th\u1EED)/, Keyword.TRY],
-  [/^(as|nh\u01B0 l\u00E0)/, Keyword.AS],
-  [/^(from|t\u1EEB)/, Keyword.FROM],
-
-  // --------------------------------------
-  // Future Reserved Words
-  [/^const|h\u1EB1ng s\u1ED1/, Keyword.CONST],
-  [/^(class|l\u1EDBp)\b/, Keyword.CLASS],
-  [/^(super|kh\u1EDFi t\u1EA1o cha)\b/, Keyword.SUPER],
-  [/^(constructor|kh\u1EDFi t\u1EA1o)\b/, Keyword.CONSTRUCTOR],
-  [/^(extends|k\u1EBF th\u1EEBa)\b/, Keyword.EXTENDS],
-  [/^(export|cho ph\u00E9p)\b/, Keyword.EXPORT],
-  [/^(import|s\u1EED d\u1EE5ng)\b/, Keyword.IMPORT],
-  [/^(async|b\u1EA5t \u0111\u1ED3ng b\u1ED9)/, Keyword.ASYNC],
-  [/^(await\b|ch\u1EDD(?![A-Za-z\u00C0-\u1EF9]))/, Keyword.AWAIT],
-  [/^(yield|nh\u01B0\u1EDDng)\b/, Keyword.YIELD],
-  [/^(let|bi\u1EBFn)\b/, Keyword.LET],
-  [/^(private\b|ri\u00EAng t\u01B0(?![A-Za-z\u00C0-\u1EF9]))/, Keyword.PRIVATE],
-  [/^(public|c\u00F4ng khai)\b/, Keyword.PUBLIC],
-  [/^(protected\b|b\u1EA3o v\u1EC7(?![A-Za-z\u00C0-\u1EF9]))/, Keyword.PROTECTED],
-  [/^(static|t\u0129nh)\b/, Keyword.STATIC],
-  [/^(get|l\u1EA5y)\b/, Keyword.GET],
-  [/^(set|g\u00E1n)\b/, Keyword.SET],
-
-  // --------------------------------------
-  // Numbers (order matters: hex/oct/bin before decimal):
-  [/^0[xX][0-9a-fA-F][0-9a-fA-F_]*n?/, Keyword.NUMBER],
-  [/^0[oO][0-7][0-7_]*n?/, Keyword.NUMBER],
-  [/^0[bB][01][01_]*n?/, Keyword.NUMBER],
-  [/^(\d[\d_]*(\.[\d_]*)?|\.[\d_]+)([eE][+-]?\d[\d_]*)?n?/, Keyword.NUMBER],
-
-  // --------------------------------------
-  // Strings (with escape support):
-  [/^"(?:\\[\s\S]|[^"\\])*"/, Keyword.STRING],
-  [/^'(?:\\[\s\S]|[^'\\])*'/, Keyword.STRING],
-
-  // --------------------------------------
-  // Literal with Keyword:
-  [/^(null|r\u1ED7ng)\b/, Keyword.NULL],
-  [/^NaN\b/, Keyword.NAN],
-  [/^(Infinity|v\u00F4 c\u1EF1c)\b/, Keyword.INFINITY],
-  [/^(undefined|kh\u00F4ng x\u00E1c \u0111\u1ECBnh)\b/, Keyword.UNDEFINED],
-  [/(true|false|\u0111\u00FAng|sai)\b/, Keyword.BOOLEAN],
-
-  // --------------------------------------
-  // Identifier
-  SpecIdentifier,
-]
diff --git a/packages/parser/src/index.ts b/packages/parser/src/index.ts
index aa68b96..e855540 100644
--- a/packages/parser/src/index.ts
+++ b/packages/parser/src/index.ts
@@ -6,9 +6,7 @@ export default parser
 
 export { VietScriptError } from './errors'
 export { Parser } from './parser'
-export type { ITokenizer, ParserOptions, TokenizerKind } from './parser'
 export { Tokenizer } from './tokenizer'
-export { TokenizerFSM } from './tokenizer-fsm'
 
 if (typeof window !== 'undefined') {
   (window as unknown as { VietScript: { parser: Parser } }).VietScript = { parser }
diff --git a/packages/parser/src/nodes/literals/TemplateLiteral.ts b/packages/parser/src/nodes/literals/TemplateLiteral.ts
index 4df5a45..720e8b5 100644
--- a/packages/parser/src/nodes/literals/TemplateLiteral.ts
+++ b/packages/parser/src/nodes/literals/TemplateLiteral.ts
@@ -1,4 +1,5 @@
-import { createTokenizer, Parser } from '@parser/parser'
+import { Parser } from '@parser/parser'
+import { Tokenizer } from '@parser/tokenizer'
 
 import { Expression } from '../expressions/Expression'
 
@@ -122,9 +123,9 @@ export class TemplateLiteral {
     }
 
     for (const exprSource of expressions) {
-      const subParser = new Parser({ tokenizer: parser.tokenizerKind })
+      const subParser = new Parser()
       subParser.syntax = exprSource
-      subParser.tokenizer = createTokenizer(subParser)
+      subParser.tokenizer = new Tokenizer(subParser)
       subParser.lookahead = subParser.tokenizer.getNextToken()
       this.expressions.push(new Expression(subParser))
     }
diff --git a/packages/parser/src/parser.ts b/packages/parser/src/parser.ts
index a695ee1..16bfa6c 100644
--- a/packages/parser/src/parser.ts
+++ b/packages/parser/src/parser.ts
@@ -4,32 +4,11 @@ import { Keyword } from '@vietscript/shared'
 import { VietScriptError } from './errors'
 import { Program } from './nodes/Program'
 import { Tokenizer } from './tokenizer'
-import { TokenizerFSM } from './tokenizer-fsm'
-
-export type TokenizerKind = 'regex' | 'fsm'
-
-export interface ITokenizer {
-  getNextToken: () => Token | null
-  isEOF: () => boolean
-  rollback: (step: number) => number
-}
-
-export interface ParserOptions {
-  tokenizer?: TokenizerKind
-}
-
-export function createTokenizer(parser: Parser): ITokenizer {
-  return parser.tokenizerKind === 'fsm'
-    ? new TokenizerFSM(parser)
-    : new Tokenizer(parser)
-}
 
 export class Parser {
   public syntax: string
 
-  public tokenizer: ITokenizer
-
-  public tokenizerKind: TokenizerKind
+  public tokenizer: Tokenizer
 
   public lookahead: Token | null
 
@@ -37,16 +16,15 @@ export class Parser {
 
   public ternaryDepth = 0
 
-  constructor(options: ParserOptions = {}) {
+  constructor() {
     this.syntax = ''
-    this.tokenizerKind = options.tokenizer ?? 'fsm'
-    this.tokenizer = createTokenizer(this)
+    this.tokenizer = new Tokenizer(this)
     this.lookahead = null
   }
 
   public parse(syntax: string, InitAtsNodeClass?: new (parser: Parser) => unknown): any {
     this.syntax = syntax
-    this.tokenizer = createTokenizer(this)
+    this.tokenizer = new Tokenizer(this)
     this.lookahead = this.tokenizer.getNextToken()
 
     if (InitAtsNodeClass)
diff --git a/packages/parser/src/tokenizer-fsm.ts b/packages/parser/src/tokenizer-fsm.ts
deleted file mode 100644
index b67709b..0000000
--- a/packages/parser/src/tokenizer-fsm.ts
+++ /dev/null
@@ -1,917 +0,0 @@
-import type { Token } from '@vietscript/shared'
-import type { Parser } from './parser'
-
-import { Keyword } from '@vietscript/shared'
-
-// Boundary kinds (post-keyword check):
-//   WORD: next char must not be in [A-Za-z0-9_] (mimics JS \b after ASCII word).
-//   IDENT: next char must not be in [A-Za-zÀ-ỹ] (mimics negative
-//          lookahead used for VI keywords ending in non-ASCII).
-//   NONE: no boundary check (mimics keywords without \b such as `else`,
-//         `return`, `try`, `as`, `from`, `const`, `async`).
-const Boundary = {
-  WORD: 0,
-  IDENT: 1,
-  NONE: 2,
-} as const
-
-type BoundaryKind = typeof Boundary[keyof typeof Boundary]
-
-interface KeywordEntry {
-  text: string
-  type: Keyword
-  boundary: BoundaryKind
-}
-
-// Mirrors the regex spec table in constants/specs.ts. The FSM walks a trie
-// built from these entries to detect keywords; ordering does not matter here
-// because the trie picks the longest valid match with a passing boundary.
-const KEYWORDS: ReadonlyArray<KeywordEntry> = [
-  { text: 'var', type: Keyword.VAR, boundary: Boundary.WORD },
-  { text: 'khai báo', type: Keyword.VAR, boundary: Boundary.WORD },
-  { text: 'break', type: Keyword.BREAK, boundary: Boundary.WORD },
-  { text: 'phá vòng lặp', type: Keyword.BREAK, boundary: Boundary.WORD },
-  { text: 'do', type: Keyword.DO, boundary: Boundary.WORD },
-  { text: 'thực hiện', type: Keyword.DO, boundary: Boundary.WORD },
-  { text: 'instanceof', type: Keyword.INSTANCEOF, boundary: Boundary.WORD },
-  { text: 'là kiểu', type: Keyword.INSTANCEOF, boundary: Boundary.WORD },
-  { text: 'typeof', type: Keyword.TYPEOF, boundary: Boundary.WORD },
-  { text: 'kiểu của', type: Keyword.TYPEOF, boundary: Boundary.WORD },
-  { text: 'switch', type: Keyword.SWITCH, boundary: Boundary.WORD },
-  { text: 'duyệt', type: Keyword.SWITCH, boundary: Boundary.WORD },
-  { text: 'case', type: Keyword.CASE, boundary: Boundary.WORD },
-  { text: 'trường hợp', type: Keyword.CASE, boundary: Boundary.WORD },
-  { text: 'if', type: Keyword.IF, boundary: Boundary.WORD },
-  { text: 'nếu', type: Keyword.IF, boundary: Boundary.WORD },
-  { text: 'else', type: Keyword.ELSE, boundary: Boundary.NONE },
-  { text: 'không thì', type: Keyword.ELSE, boundary: Boundary.NONE },
-  { text: 'new', type: Keyword.NEW, boundary: Boundary.WORD },
-  { text: 'catch', type: Keyword.CATCH, boundary: Boundary.WORD },
-  { text: 'bắt lỗi', type: Keyword.CATCH, boundary: Boundary.WORD },
-  { text: 'finally', type: Keyword.FINALLY, boundary: Boundary.WORD },
-  { text: 'cuối cùng', type: Keyword.FINALLY, boundary: Boundary.WORD },
-  { text: 'return', type: Keyword.RETURN, boundary: Boundary.NONE },
-  { text: 'trả về', type: Keyword.RETURN, boundary: Boundary.NONE },
-  { text: 'void', type: Keyword.VOID, boundary: Boundary.WORD },
-  { text: 'continue', type: Keyword.CONTINUE, boundary: Boundary.WORD },
-  { text: 'tiếp tục', type: Keyword.CONTINUE, boundary: Boundary.WORD },
-  { text: 'for', type: Keyword.FOR, boundary: Boundary.WORD },
-  { text: 'lặp', type: Keyword.FOR, boundary: Boundary.WORD },
-  { text: 'while', type: Keyword.WHILE, boundary: Boundary.WORD },
-  { text: 'khi mà', type: Keyword.WHILE, boundary: Boundary.IDENT },
-  { text: 'debugger', type: Keyword.DEBUGGER, boundary: Boundary.WORD },
-  { text: 'function', type: Keyword.FUNCTION, boundary: Boundary.WORD },
-  { text: 'hàm', type: Keyword.FUNCTION, boundary: Boundary.WORD },
-  { text: 'this', type: Keyword.THIS, boundary: Boundary.WORD },
-  { text: 'đây', type: Keyword.THIS, boundary: Boundary.WORD },
-  { text: 'with', type: Keyword.WITH, boundary: Boundary.WORD },
-  { text: 'default', type: Keyword.DEFAULT, boundary: Boundary.WORD },
-  { text: 'mặc định', type: Keyword.DEFAULT, boundary: Boundary.WORD },
-  { text: 'throw', type: Keyword.THROW, boundary: Boundary.WORD },
-  { text: 'báo lỗi', type: Keyword.THROW, boundary: Boundary.WORD },
-  { text: 'delete', type: Keyword.DELETE, boundary: Boundary.WORD },
-  { text: 'xoá', type: Keyword.DELETE, boundary: Boundary.IDENT },
-  { text: 'in', type: Keyword.IN, boundary: Boundary.WORD },
-  { text: 'trong', type: Keyword.IN, boundary: Boundary.WORD },
-  { text: 'of', type: Keyword.OF, boundary: Boundary.WORD },
-  { text: 'của', type: Keyword.OF, boundary: Boundary.WORD },
-  { text: 'try', type: Keyword.TRY, boundary: Boundary.NONE },
-  { text: 'thử', type: Keyword.TRY, boundary: Boundary.NONE },
-  { text: 'as', type: Keyword.AS, boundary: Boundary.NONE },
-  { text: 'như là', type: Keyword.AS, boundary: Boundary.NONE },
-  { text: 'from', type: Keyword.FROM, boundary: Boundary.NONE },
-  { text: 'từ', type: Keyword.FROM, boundary: Boundary.NONE },
-  { text: 'const', type: Keyword.CONST, boundary: Boundary.NONE },
-  { text: 'hằng số', type: Keyword.CONST, boundary: Boundary.NONE },
-  { text: 'class', type: Keyword.CLASS, boundary: Boundary.WORD },
-  { text: 'lớp', type: Keyword.CLASS, boundary: Boundary.WORD },
-  { text: 'super', type: Keyword.SUPER, boundary: Boundary.WORD },
-  { text: 'khởi tạo cha', type: Keyword.SUPER, boundary: Boundary.WORD },
-  { text: 'constructor', type: Keyword.CONSTRUCTOR, boundary: Boundary.WORD },
-  { text: 'khởi tạo', type: Keyword.CONSTRUCTOR, boundary: Boundary.WORD },
-  { text: 'extends', type: Keyword.EXTENDS, boundary: Boundary.WORD },
-  { text: 'kế thừa', type: Keyword.EXTENDS, boundary: Boundary.WORD },
-  { text: 'export', type: Keyword.EXPORT, boundary: Boundary.WORD },
-  { text: 'cho phép', type: Keyword.EXPORT, boundary: Boundary.WORD },
-  { text: 'import', type: Keyword.IMPORT, boundary: Boundary.WORD },
-  { text: 'sử dụng', type: Keyword.IMPORT, boundary: Boundary.WORD },
-  { text: 'async', type: Keyword.ASYNC, boundary: Boundary.NONE },
-  { text: 'bất đồng bộ', type: Keyword.ASYNC, boundary: Boundary.NONE },
-  { text: 'await', type: Keyword.AWAIT, boundary: Boundary.WORD },
-  { text: 'chờ', type: Keyword.AWAIT, boundary: Boundary.IDENT },
-  { text: 'yield', type: Keyword.YIELD, boundary: Boundary.WORD },
-  { text: 'nhường', type: Keyword.YIELD, boundary: Boundary.WORD },
-  { text: 'let', type: Keyword.LET, boundary: Boundary.WORD },
-  { text: 'biến', type: Keyword.LET, boundary: Boundary.WORD },
-  { text: 'private', type: Keyword.PRIVATE, boundary: Boundary.WORD },
-  { text: 'riêng tư', type: Keyword.PRIVATE, boundary: Boundary.IDENT },
-  { text: 'public', type: Keyword.PUBLIC, boundary: Boundary.WORD },
-  { text: 'công khai', type: Keyword.PUBLIC, boundary: Boundary.WORD },
-  { text: 'protected', type: Keyword.PROTECTED, boundary: Boundary.WORD },
-  { text: 'bảo vệ', type: Keyword.PROTECTED, boundary: Boundary.IDENT },
-  { text: 'static', type: Keyword.STATIC, boundary: Boundary.WORD },
-  { text: 'tĩnh', type: Keyword.STATIC, boundary: Boundary.WORD },
-  { text: 'get', type: Keyword.GET, boundary: Boundary.WORD },
-  { text: 'lấy', type: Keyword.GET, boundary: Boundary.WORD },
-  { text: 'set', type: Keyword.SET, boundary: Boundary.WORD },
-  { text: 'gán', type: Keyword.SET, boundary: Boundary.WORD },
-  { text: 'null', type: Keyword.NULL, boundary: Boundary.WORD },
-  { text: 'rỗng', type: Keyword.NULL, boundary: Boundary.WORD },
-  { text: 'NaN', type: Keyword.NAN, boundary: Boundary.WORD },
-  { text: 'Infinity', type: Keyword.INFINITY, boundary: Boundary.WORD },
-  { text: 'vô cực', type: Keyword.INFINITY, boundary: Boundary.WORD },
-  { text: 'undefined', type: Keyword.UNDEFINED, boundary: Boundary.WORD },
-  { text: 'không xác định', type: Keyword.UNDEFINED, boundary: Boundary.WORD },
-  { text: 'true', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
-  { text: 'false', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
-  { text: 'đúng', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
-  { text: 'sai', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
-]
-
-class TrieNode {
-  children = new Map<number, TrieNode>()
-  type: Keyword | null = null
-  boundary: BoundaryKind = Boundary.WORD
-}
-
-const KEYWORD_TRIE: TrieNode = (() => {
-  const root = new TrieNode()
-  for (const entry of KEYWORDS) {
-    let node = root
-    for (let i = 0; i < entry.text.length; i++) {
-      const code = entry.text.charCodeAt(i)
-      let child = node.children.get(code)
-      if (!child) {
-        child = new TrieNode()
-        node.children.set(code, child)
-      }
-      node = child
-    }
-    if (node.type === null) {
-      node.type = entry.type
-      node.boundary = entry.boundary
-    }
-  }
-  return root
-})()
-
-// Operator longest-match tree. Built from a flat list of operator strings.
-// Order independent — at lookup time we walk character-by-character and keep
-// track of the longest valid operator end seen.
-const OPERATORS: readonly string[] = [
-  '...',
-  '.',
-  '>>>=',
-  '>>>',
-  '>>=',
-  '>>',
-  '<<=',
-  '<<',
-  '<=',
-  '>=',
-  '<',
-  '>',
-  '===',
-  '!==',
-  '==',
-  '!=',
-  '=>',
-  '**=',
-  '**',
-  '*=',
-  '*',
-  '++',
-  '+=',
-  '+',
-  '--',
-  '-=',
-  '-',
-  '/=',
-  '/',
-  '%=',
-  '%',
-  '&&=',
-  '&&',
-  '&=',
-  '&',
-  '||=',
-  '||',
-  '|=',
-  '|',
-  '^=',
-  '^',
-  '~',
-  '!',
-  '??=',
-  '??',
-  '?.',
-  '?',
-  '=',
-  '[',
-  ']',
-  '(',
-  ')',
-  '{',
-  '}',
-  ';',
-  ',',
-  ':',
-  '#',
-]
-
-class OperatorNode {
-  children = new Map<number, OperatorNode>()
-  value: string | null = null
-}
-
-const OPERATOR_TRIE: OperatorNode = (() => {
-  const root = new OperatorNode()
-  for (const op of OPERATORS) {
-    let node = root
-    for (let i = 0; i < op.length; i++) {
-      const code = op.charCodeAt(i)
-      let child = node.children.get(code)
-      if (!child) {
-        child = new OperatorNode()
-        node.children.set(code, child)
-      }
-      node = child
-    }
-    node.value = op
-  }
-  return root
-})()
-
-const REGEX_PRECEDING_TOKENS = new Set<string>([
-  '(',
-  '[',
-  '{',
-  ',',
-  ';',
-  ':',
-  '=',
-  '!',
-  '?',
-  '+',
-  '-',
-  '*',
-  '/',
-  '%',
-  '&&',
-  '||',
-  '??',
-  '=>',
-  '==',
-  '===',
-  '!=',
-  '!==',
-  '<',
-  '>',
-  '<=',
-  '>=',
-  '&',
-  '|',
-  '^',
-  '~',
-  '<<',
-  '>>',
-  '>>>',
-  '+=',
-  '-=',
-  '*=',
-  '/=',
-  '%=',
-  '**=',
-  '&=',
-  '|=',
-  '^=',
-  '<<=',
-  '>>=',
-  '>>>=',
-  '&&=',
-  '||=',
-  '??=',
-  '...',
-  Keyword.RETURN,
-  Keyword.YIELD,
-  Keyword.AWAIT,
-  Keyword.TYPEOF,
-  Keyword.VOID,
-  Keyword.DELETE,
-  Keyword.NEW,
-  Keyword.THROW,
-  Keyword.IN,
-  Keyword.OF,
-  Keyword.INSTANCEOF,
-  Keyword.CASE,
-  Keyword.DEFAULT,
-])
-
-function isWhitespace(code: number): boolean {
-  // Mirrors JS \s minus what we don't expect in source. We rely on String.prototype
-  // to be lenient — anything not handled below falls through to the operator
-  // dispatcher and throws cleanly.
-  return code === 0x20 /* space */
-    || code === 0x09 /* tab */
-    || code === 0x0A /* LF */
-    || code === 0x0D /* CR */
-    || code === 0x0B /* VT */
-    || code === 0x0C /* FF */
-    || code === 0xA0 /* NBSP */
-}
-
-function isAsciiLetter(code: number): boolean {
-  return (code >= 0x41 && code <= 0x5A) || (code >= 0x61 && code <= 0x7A)
-}
-
-function isVietnameseLetter(code: number): boolean {
-  return code >= 0x00C0 && code <= 0x1EF9
-}
-
-function isIdentStart(code: number): boolean {
-  return isAsciiLetter(code) || isVietnameseLetter(code)
-}
-
-function isDigit(code: number): boolean {
-  return code >= 0x30 && code <= 0x39
-}
-
-function isIdentCont(code: number): boolean {
-  return isIdentStart(code) || isDigit(code)
-}
-
-function isHexDigit(code: number): boolean {
-  return isDigit(code)
-    || (code >= 0x41 && code <= 0x46)
-    || (code >= 0x61 && code <= 0x66)
-}
-
-function isOctalDigit(code: number): boolean {
-  return code >= 0x30 && code <= 0x37
-}
-
-function isBinaryDigit(code: number): boolean {
-  return code === 0x30 || code === 0x31
-}
-
-function isAsciiWordChar(code: number): boolean {
-  return isAsciiLetter(code) || isDigit(code) || code === 0x5F /* _ */
-}
-
-export class TokenizerFSM {
-  private parser: Parser
-
-  private cursor: number
-
-  private lastTokenType: string | null = null
-
-  constructor(parser: Parser) {
-    this.parser = parser
-    this.cursor = 0
-  }
-
-  public getCursor(): number {
-    return this.cursor
-  }
-
-  public rollback(step: number): number {
-    if (this.parser.lookahead)
-      this.parser.lookahead.end -= step
-    this.cursor -= step
-    return this.cursor
-  }
-
-  public isEOF(): boolean {
-    return this.cursor === this.parser.syntax.length
-  }
-
-  protected hasMoreTokens(): boolean {
-    return this.cursor < this.parser.syntax.length
-  }
-
-  public getNextToken(): Token | null {
-    const source = this.parser.syntax
-    const length = source.length
-
-    while (this.cursor < length) {
-      const start = this.cursor
-      const code = source.charCodeAt(this.cursor)
-
-      if (isWhitespace(code)) {
-        this.cursor++
-        continue
-      }
-
-      // Line comment: //...
-      if (code === 0x2F && source.charCodeAt(this.cursor + 1) === 0x2F) {
-        this.cursor += 2
-        while (this.cursor < length && source.charCodeAt(this.cursor) !== 0x0A) {
-          this.cursor++
-        }
-        continue
-      }
-
-      // Block comment: /* ... */
-      if (code === 0x2F && source.charCodeAt(this.cursor + 1) === 0x2A) {
-        this.cursor += 2
-        while (this.cursor < length) {
-          if (source.charCodeAt(this.cursor) === 0x2A
-            && source.charCodeAt(this.cursor + 1) === 0x2F) {
-            this.cursor += 2
-            break
-          }
-          this.cursor++
-        }
-        continue
-      }
-
-      // Template literal
-      if (code === 0x60) {
-        return this.scanTemplateLiteral(start)
-      }
-
-      // Regex literal (context-sensitive)
-      if (code === 0x2F && this.isRegexExpected()) {
-        const tok = this.scanRegexLiteral(start)
-        if (tok !== null) {
-          this.lastTokenType = tok.type as string
-          return tok
-        }
-      }
-
-      // String literals
-      if (code === 0x22 || code === 0x27) {
-        return this.scanString(start, code)
-      }
-
-      // Numeric literals: digits, or `.` followed by digit
-      if (isDigit(code)) {
-        return this.scanNumber(start)
-      }
-      if (code === 0x2E /* . */ && isDigit(source.charCodeAt(this.cursor + 1))) {
-        return this.scanNumber(start)
-      }
-
-      // Identifier / keyword
-      if (isIdentStart(code)) {
-        return this.scanIdentifierOrKeyword(start)
-      }
-
-      // Operator (longest match via trie)
-      const opTok = this.scanOperator(start)
-      if (opTok !== null) {
-        return opTok
-      }
-
-      throw new SyntaxError(`Unexpected token: "${source[this.cursor]}"`)
-    }
-
-    return null
-  }
-
-  private isRegexExpected(): boolean {
-    if (this.lastTokenType === null)
-      return true
-    return REGEX_PRECEDING_TOKENS.has(this.lastTokenType)
-  }
-
-  private scanString(start: number, quote: number): Token {
-    const source = this.parser.syntax
-    const length = source.length
-    let i = start + 1
-    while (i < length) {
-      const ch = source.charCodeAt(i)
-      if (ch === 0x5C /* \ */) {
-        i += 2
-        continue
-      }
-      if (ch === quote) {
-        i++
-        const value = source.slice(start, i)
-        this.cursor = i
-        this.lastTokenType = Keyword.STRING
-        return {
-          type: Keyword.STRING,
-          value,
-          start,
-          end: i,
-        }
-      }
-      i++
-    }
-    throw new SyntaxError(`Unterminated string literal at ${start}`)
-  }
-
-  private scanNumber(start: number): Token {
-    const source = this.parser.syntax
-    const length = source.length
-    let i = start
-
-    // Leading dot decimal: .5, .5e2, .5n (n probably nonsense, but match regex)
-    if (source.charCodeAt(i) === 0x2E /* . */) {
-      i++
-      while (i < length) {
-        const c = source.charCodeAt(i)
-        if (isDigit(c) || c === 0x5F) {
-          i++
-          continue
-        }
-        break
-      }
-      i = this.consumeExponent(i)
-      i = this.consumeBigIntSuffix(i)
-      return this.emitNumber(start, i)
-    }
-
-    // 0x / 0o / 0b
-    if (source.charCodeAt(i) === 0x30) {
-      const next = source.charCodeAt(i + 1)
-      if (next === 0x78 || next === 0x58) {
-        i += 2
-        while (i < length) {
-          const c = source.charCodeAt(i)
-          if (isHexDigit(c) || c === 0x5F) {
-            i++
-            continue
-          }
-          break
-        }
-        i = this.consumeBigIntSuffix(i)
-        return this.emitNumber(start, i)
-      }
-      if (next === 0x6F || next === 0x4F) {
-        i += 2
-        while (i < length) {
-          const c = source.charCodeAt(i)
-          if (isOctalDigit(c) || c === 0x5F) {
-            i++
-            continue
-          }
-          break
-        }
-        i = this.consumeBigIntSuffix(i)
-        return this.emitNumber(start, i)
-      }
-      if (next === 0x62 || next === 0x42) {
-        i += 2
-        while (i < length) {
-          const c = source.charCodeAt(i)
-          if (isBinaryDigit(c) || c === 0x5F) {
-            i++
-            continue
-          }
-          break
-        }
-        i = this.consumeBigIntSuffix(i)
-        return this.emitNumber(start, i)
-      }
-    }
-
-    // Decimal: digits[_digits]*[.digits[_digits]*]?
-    while (i < length) {
-      const c = source.charCodeAt(i)
-      if (isDigit(c) || c === 0x5F) {
-        i++
-        continue
-      }
-      break
-    }
-    if (source.charCodeAt(i) === 0x2E /* . */) {
-      i++
-      while (i < length) {
-        const c = source.charCodeAt(i)
-        if (isDigit(c) || c === 0x5F) {
-          i++
-          continue
-        }
-        break
-      }
-    }
-    i = this.consumeExponent(i)
-    i = this.consumeBigIntSuffix(i)
-    return this.emitNumber(start, i)
-  }
-
-  private consumeExponent(i: number): number {
-    const source = this.parser.syntax
-    const length = source.length
-    const c = source.charCodeAt(i)
-    if (c !== 0x65 && c !== 0x45 /* e/E */) {
-      return i
-    }
-    let j = i + 1
-    const sign = source.charCodeAt(j)
-    if (sign === 0x2B || sign === 0x2D /* + or - */) {
-      j++
-    }
-    if (!isDigit(source.charCodeAt(j))) {
-      return i
-    }
-    j++
-    while (j < length) {
-      const cc = source.charCodeAt(j)
-      if (isDigit(cc) || cc === 0x5F) {
-        j++
-        continue
-      }
-      break
-    }
-    return j
-  }
-
-  private consumeBigIntSuffix(i: number): number {
-    if (this.parser.syntax.charCodeAt(i) === 0x6E /* n */) {
-      return i + 1
-    }
-    return i
-  }
-
-  private emitNumber(start: number, end: number): Token {
-    const value = this.parser.syntax.slice(start, end)
-    this.cursor = end
-    this.lastTokenType = Keyword.NUMBER
-    return {
-      type: Keyword.NUMBER,
-      value,
-      start,
-      end,
-    }
-  }
-
-  private scanIdentifierOrKeyword(start: number): Token {
-    // Try keyword trie first; on a hit with passing boundary we emit the keyword.
-    // Otherwise we fall back to identifier scanning (with multi-word support
-    // and embedded-keyword truncation).
-    const kw = this.matchKeyword(start)
-    if (kw !== null) {
-      this.cursor = kw.end
-      this.lastTokenType = kw.type as string
-      return {
-        type: kw.type,
-        value: this.parser.syntax.slice(start, kw.end),
-        start,
-        end: kw.end,
-      }
-    }
-    return this.scanIdentifier(start)
-  }
-
-  private matchKeyword(start: number): { type: Keyword, end: number } | null {
-    const source = this.parser.syntax
-    const length = source.length
-    let node: TrieNode | undefined = KEYWORD_TRIE
-    let i = start
-    let bestType: Keyword | null = null
-    let bestEnd = -1
-
-    while (i < length && node !== undefined) {
-      const code = source.charCodeAt(i)
-      const next = node.children.get(code)
-      if (next === undefined)
-        break
-      i++
-      node = next
-      if (node.type !== null && this.boundaryOk(node.boundary, i)) {
-        bestType = node.type
-        bestEnd = i
-      }
-    }
-
-    if (bestType !== null && bestEnd !== -1) {
-      return { type: bestType, end: bestEnd }
-    }
-    return null
-  }
-
-  private boundaryOk(kind: BoundaryKind, end: number): boolean {
-    if (kind === Boundary.NONE)
-      return true
-    if (end >= this.parser.syntax.length)
-      return true
-    const code = this.parser.syntax.charCodeAt(end)
-    if (kind === Boundary.WORD) {
-      // Mimic JS \b after ASCII word char: next must not be [A-Za-z0-9_]
-      return !isAsciiWordChar(code)
-    }
-    // IDENT: next must not be [A-Za-zÀ-ỹ]
-    return !isIdentStart(code)
-  }
-
-  private scanIdentifier(start: number): Token {
-    const source = this.parser.syntax
-    const length = source.length
-    let i = start
-    if (!isIdentStart(source.charCodeAt(i))) {
-      throw new SyntaxError(`Unexpected token: "${source[i]}"`)
-    }
-    i++
-    while (i < length && isIdentCont(source.charCodeAt(i))) {
-      i++
-    }
-
-    // Multi-word identifier: consume ` <word>` repeatedly, but stop before
-    // a word that would itself start a keyword at this position.
-    while (i < length) {
-      if (source.charCodeAt(i) !== 0x20 /* space */)
-        break
-      const wordStart = i + 1
-      if (wordStart >= length)
-        break
-      const wordCode = source.charCodeAt(wordStart)
-      if (!isIdentStart(wordCode))
-        break
-      // Embedded-keyword check: starting from wordStart, would a keyword
-      // match? If yes, do NOT consume the space, terminate identifier here.
-      if (this.matchKeyword(wordStart) !== null) {
-        break
-      }
-      i = wordStart + 1
-      while (i < length && isIdentCont(source.charCodeAt(i))) {
-        i++
-      }
-    }
-
-    const value = source.slice(start, i)
-    this.cursor = i
-    this.lastTokenType = Keyword.IDENTIFIER
-    return {
-      type: Keyword.IDENTIFIER,
-      value,
-      start,
-      end: i,
-    }
-  }
-
-  private scanOperator(start: number): Token | null {
-    const source = this.parser.syntax
-    const length = source.length
-    let node: OperatorNode | undefined = OPERATOR_TRIE
-    let i = start
-    let bestEnd = -1
-    let bestValue: string | null = null
-
-    while (i < length && node !== undefined) {
-      const code = source.charCodeAt(i)
-      const next = node.children.get(code)
-      if (next === undefined)
-        break
-      i++
-      node = next
-      if (node.value !== null) {
-        bestEnd = i
-        bestValue = node.value
-      }
-    }
-
-    if (bestValue === null || bestEnd === -1) {
-      return null
-    }
-    this.cursor = bestEnd
-    this.lastTokenType = bestValue
-    return {
-      type: bestValue,
-      value: bestValue,
-      start,
-      end: bestEnd,
-    }
-  }
-
-  private scanRegexLiteral(start: number): Token | null {
-    const source = this.parser.syntax
-    const length = source.length
-    let i = start + 1
-    let inCharClass = false
-
-    while (i < length) {
-      const ch = source.charCodeAt(i)
-      if (ch === 0x5C /* \ */) {
-        i += 2
-        continue
-      }
-      if (ch === 0x5B /* [ */) {
-        inCharClass = true
-        i++
-        continue
-      }
-      if (ch === 0x5D /* ] */) {
-        inCharClass = false
-        i++
-        continue
-      }
-      if (ch === 0x2F /* / */ && !inCharClass) {
-        i++
-        while (i < length) {
-          const fc = source.charCodeAt(i)
-          if (fc >= 0x61 && fc <= 0x7A) {
-            i++
-            continue
-          }
-          break
-        }
-        const value = source.slice(start, i)
-        this.cursor = i
-        return {
-          type: 'RegExpLiteral',
-          value,
-          start,
-          end: i,
-        }
-      }
-      if (ch === 0x0A /* \n */) {
-        return null
-      }
-      i++
-    }
-    return null
-  }
-
-  private scanTemplateLiteral(start: number): Token {
-    const source = this.parser.syntax
-    const length = source.length
-    let i = start + 1
-
-    while (i < length) {
-      const ch = source.charCodeAt(i)
-
-      if (ch === 0x5C /* \ */) {
-        i += 2
-        continue
-      }
-
-      if (ch === 0x60 /* ` */) {
-        i++
-        const value = source.slice(start, i)
-        this.cursor = i
-        this.lastTokenType = 'TemplateLiteral'
-        return {
-          type: 'TemplateLiteral',
-          value,
-          start,
-          end: i,
-        }
-      }
-
-      if (ch === 0x24 /* $ */ && source.charCodeAt(i + 1) === 0x7B /* { */) {
-        i += 2
-        let depth = 1
-        while (i < length && depth > 0) {
-          const inner = source.charCodeAt(i)
-          if (inner === 0x5C) {
-            i += 2
-            continue
-          }
-          if (inner === 0x22 || inner === 0x27) {
-            const quote = inner
-            i++
-            while (i < length && source.charCodeAt(i) !== quote) {
-              if (source.charCodeAt(i) === 0x5C)
-                i++
-              i++
-            }
-            i++
-            continue
-          }
-          if (inner === 0x60 /* ` */) {
-            i++
-            while (i < length) {
-              if (source.charCodeAt(i) === 0x5C) {
-                i += 2
-                continue
-              }
-              if (source.charCodeAt(i) === 0x24
-                && source.charCodeAt(i + 1) === 0x7B) {
-                i += 2
-                let innerDepth = 1
-                while (i < length && innerDepth > 0) {
-                  const ic = source.charCodeAt(i)
-                  if (ic === 0x7B)
-                    innerDepth++
-                  else if (ic === 0x7D)
-                    innerDepth--
-                  i++
-                }
-                continue
-              }
-              if (source.charCodeAt(i) === 0x60) {
-                i++
-                break
-              }
-              i++
-            }
-            continue
-          }
-          if (inner === 0x7B)
-            depth++
-          else if (inner === 0x7D)
-            depth--
-          i++
-        }
-        continue
-      }
-
-      i++
-    }
-
-    throw new SyntaxError(`Template literal không đóng, bắt đầu tại vị trí ${start}`)
-  }
-}
diff --git a/packages/parser/src/tokenizer.ts b/packages/parser/src/tokenizer.ts
index fe1c5a8..de3ff4c 100644
--- a/packages/parser/src/tokenizer.ts
+++ b/packages/parser/src/tokenizer.ts
@@ -2,8 +2,247 @@ import type { Token } from '@vietscript/shared'
 import type { Parser } from './parser'
 
 import { Keyword } from '@vietscript/shared'
-import { Specs } from './constants/specs'
 
+// Boundary kinds (post-keyword check):
+//   WORD: next char must not be in [A-Za-z0-9_] (mimics JS \b after a word char).
+//   IDENT: next char must not be in [A-Za-zÀ-ỹ] (used for VI keywords ending
+//          in non-ASCII, where \b doesn't apply).
+//   NONE: no boundary check (for keywords like `else`, `return`, `try`, `as`,
+//         `from`, `const`, `async` that may abut other identifiers).
+const Boundary = {
+  WORD: 0,
+  IDENT: 1,
+  NONE: 2,
+} as const
+
+type BoundaryKind = typeof Boundary[keyof typeof Boundary]
+
+interface KeywordEntry {
+  text: string
+  type: Keyword
+  boundary: BoundaryKind
+}
+
+// Source of truth for every keyword (English + Vietnamese aliases). The trie
+// below is built from this list; ordering does not matter — `matchKeyword`
+// returns the longest valid match with a passing boundary.
+const KEYWORDS: ReadonlyArray<KeywordEntry> = [
+  { text: 'var', type: Keyword.VAR, boundary: Boundary.WORD },
+  { text: 'khai báo', type: Keyword.VAR, boundary: Boundary.WORD },
+  { text: 'break', type: Keyword.BREAK, boundary: Boundary.WORD },
+  { text: 'phá vòng lặp', type: Keyword.BREAK, boundary: Boundary.WORD },
+  { text: 'do', type: Keyword.DO, boundary: Boundary.WORD },
+  { text: 'thực hiện', type: Keyword.DO, boundary: Boundary.WORD },
+  { text: 'instanceof', type: Keyword.INSTANCEOF, boundary: Boundary.WORD },
+  { text: 'là kiểu', type: Keyword.INSTANCEOF, boundary: Boundary.WORD },
+  { text: 'typeof', type: Keyword.TYPEOF, boundary: Boundary.WORD },
+  { text: 'kiểu của', type: Keyword.TYPEOF, boundary: Boundary.WORD },
+  { text: 'switch', type: Keyword.SWITCH, boundary: Boundary.WORD },
+  { text: 'duyệt', type: Keyword.SWITCH, boundary: Boundary.WORD },
+  { text: 'case', type: Keyword.CASE, boundary: Boundary.WORD },
+  { text: 'trường hợp', type: Keyword.CASE, boundary: Boundary.WORD },
+  { text: 'if', type: Keyword.IF, boundary: Boundary.WORD },
+  { text: 'nếu', type: Keyword.IF, boundary: Boundary.WORD },
+  { text: 'else', type: Keyword.ELSE, boundary: Boundary.NONE },
+  { text: 'không thì', type: Keyword.ELSE, boundary: Boundary.NONE },
+  { text: 'new', type: Keyword.NEW, boundary: Boundary.WORD },
+  { text: 'catch', type: Keyword.CATCH, boundary: Boundary.WORD },
+  { text: 'bắt lỗi', type: Keyword.CATCH, boundary: Boundary.WORD },
+  { text: 'finally', type: Keyword.FINALLY, boundary: Boundary.WORD },
+  { text: 'cuối cùng', type: Keyword.FINALLY, boundary: Boundary.WORD },
+  { text: 'return', type: Keyword.RETURN, boundary: Boundary.NONE },
+  { text: 'trả về', type: Keyword.RETURN, boundary: Boundary.NONE },
+  { text: 'void', type: Keyword.VOID, boundary: Boundary.WORD },
+  { text: 'continue', type: Keyword.CONTINUE, boundary: Boundary.WORD },
+  { text: 'tiếp tục', type: Keyword.CONTINUE, boundary: Boundary.WORD },
+  { text: 'for', type: Keyword.FOR, boundary: Boundary.WORD },
+  { text: 'lặp', type: Keyword.FOR, boundary: Boundary.WORD },
+  { text: 'while', type: Keyword.WHILE, boundary: Boundary.WORD },
+  { text: 'khi mà', type: Keyword.WHILE, boundary: Boundary.IDENT },
+  { text: 'debugger', type: Keyword.DEBUGGER, boundary: Boundary.WORD },
+  { text: 'function', type: Keyword.FUNCTION, boundary: Boundary.WORD },
+  { text: 'hàm', type: Keyword.FUNCTION, boundary: Boundary.WORD },
+  { text: 'this', type: Keyword.THIS, boundary: Boundary.WORD },
+  { text: 'đây', type: Keyword.THIS, boundary: Boundary.WORD },
+  { text: 'with', type: Keyword.WITH, boundary: Boundary.WORD },
+  { text: 'default', type: Keyword.DEFAULT, boundary: Boundary.WORD },
+  { text: 'mặc định', type: Keyword.DEFAULT, boundary: Boundary.WORD },
+  { text: 'throw', type: Keyword.THROW, boundary: Boundary.WORD },
+  { text: 'báo lỗi', type: Keyword.THROW, boundary: Boundary.WORD },
+  { text: 'delete', type: Keyword.DELETE, boundary: Boundary.WORD },
+  { text: 'xoá', type: Keyword.DELETE, boundary: Boundary.IDENT },
+  { text: 'in', type: Keyword.IN, boundary: Boundary.WORD },
+  { text: 'trong', type: Keyword.IN, boundary: Boundary.WORD },
+  { text: 'of', type: Keyword.OF, boundary: Boundary.WORD },
+  { text: 'của', type: Keyword.OF, boundary: Boundary.WORD },
+  { text: 'try', type: Keyword.TRY, boundary: Boundary.NONE },
+  { text: 'thử', type: Keyword.TRY, boundary: Boundary.NONE },
+  { text: 'as', type: Keyword.AS, boundary: Boundary.NONE },
+  { text: 'như là', type: Keyword.AS, boundary: Boundary.NONE },
+  { text: 'from', type: Keyword.FROM, boundary: Boundary.NONE },
+  { text: 'từ', type: Keyword.FROM, boundary: Boundary.NONE },
+  { text: 'const', type: Keyword.CONST, boundary: Boundary.NONE },
+  { text: 'hằng số', type: Keyword.CONST, boundary: Boundary.NONE },
+  { text: 'class', type: Keyword.CLASS, boundary: Boundary.WORD },
+  { text: 'lớp', type: Keyword.CLASS, boundary: Boundary.WORD },
+  { text: 'super', type: Keyword.SUPER, boundary: Boundary.WORD },
+  { text: 'khởi tạo cha', type: Keyword.SUPER, boundary: Boundary.WORD },
+  { text: 'constructor', type: Keyword.CONSTRUCTOR, boundary: Boundary.WORD },
+  { text: 'khởi tạo', type: Keyword.CONSTRUCTOR, boundary: Boundary.WORD },
+  { text: 'extends', type: Keyword.EXTENDS, boundary: Boundary.WORD },
+  { text: 'kế thừa', type: Keyword.EXTENDS, boundary: Boundary.WORD },
+  { text: 'export', type: Keyword.EXPORT, boundary: Boundary.WORD },
+  { text: 'cho phép', type: Keyword.EXPORT, boundary: Boundary.WORD },
+  { text: 'import', type: Keyword.IMPORT, boundary: Boundary.WORD },
+  { text: 'sử dụng', type: Keyword.IMPORT, boundary: Boundary.WORD },
+  { text: 'async', type: Keyword.ASYNC, boundary: Boundary.NONE },
+  { text: 'bất đồng bộ', type: Keyword.ASYNC, boundary: Boundary.NONE },
+  { text: 'await', type: Keyword.AWAIT, boundary: Boundary.WORD },
+  { text: 'chờ', type: Keyword.AWAIT, boundary: Boundary.IDENT },
+  { text: 'yield', type: Keyword.YIELD, boundary: Boundary.WORD },
+  { text: 'nhường', type: Keyword.YIELD, boundary: Boundary.WORD },
+  { text: 'let', type: Keyword.LET, boundary: Boundary.WORD },
+  { text: 'biến', type: Keyword.LET, boundary: Boundary.WORD },
+  { text: 'private', type: Keyword.PRIVATE, boundary: Boundary.WORD },
+  { text: 'riêng tư', type: Keyword.PRIVATE, boundary: Boundary.IDENT },
+  { text: 'public', type: Keyword.PUBLIC, boundary: Boundary.WORD },
+  { text: 'công khai', type: Keyword.PUBLIC, boundary: Boundary.WORD },
+  { text: 'protected', type: Keyword.PROTECTED, boundary: Boundary.WORD },
+  { text: 'bảo vệ', type: Keyword.PROTECTED, boundary: Boundary.IDENT },
+  { text: 'static', type: Keyword.STATIC, boundary: Boundary.WORD },
+  { text: 'tĩnh', type: Keyword.STATIC, boundary: Boundary.WORD },
+  { text: 'get', type: Keyword.GET, boundary: Boundary.WORD },
+  { text: 'lấy', type: Keyword.GET, boundary: Boundary.WORD },
+  { text: 'set', type: Keyword.SET, boundary: Boundary.WORD },
+  { text: 'gán', type: Keyword.SET, boundary: Boundary.WORD },
+  { text: 'null', type: Keyword.NULL, boundary: Boundary.WORD },
+  { text: 'rỗng', type: Keyword.NULL, boundary: Boundary.WORD },
+  { text: 'NaN', type: Keyword.NAN, boundary: Boundary.WORD },
+  { text: 'Infinity', type: Keyword.INFINITY, boundary: Boundary.WORD },
+  { text: 'vô cực', type: Keyword.INFINITY, boundary: Boundary.WORD },
+  { text: 'undefined', type: Keyword.UNDEFINED, boundary: Boundary.WORD },
+  { text: 'không xác định', type: Keyword.UNDEFINED, boundary: Boundary.WORD },
+  { text: 'true', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
+  { text: 'false', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
+  { text: 'đúng', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
+  { text: 'sai', type: Keyword.BOOLEAN, boundary: Boundary.WORD },
+]
+
+class TrieNode {
+  children = new Map<number, TrieNode>()
+  type: Keyword | null = null
+  boundary: BoundaryKind = Boundary.WORD
+}
+
+const KEYWORD_TRIE: TrieNode = (() => {
+  const root = new TrieNode()
+  for (const entry of KEYWORDS) {
+    let node = root
+    for (let i = 0; i < entry.text.length; i++) {
+      const code = entry.text.charCodeAt(i)
+      let child = node.children.get(code)
+      if (!child) {
+        child = new TrieNode()
+        node.children.set(code, child)
+      }
+      node = child
+    }
+    if (node.type === null) {
+      node.type = entry.type
+      node.boundary = entry.boundary
+    }
+  }
+  return root
+})()
+
+// Operator longest-match trie. Walking char-by-char, we keep the deepest
+// node that is itself a valid operator end.
+const OPERATORS: readonly string[] = [
+  '...',
+  '.',
+  '>>>=',
+  '>>>',
+  '>>=',
+  '>>',
+  '<<=',
+  '<<',
+  '<=',
+  '>=',
+  '<',
+  '>',
+  '===',
+  '!==',
+  '==',
+  '!=',
+  '=>',
+  '**=',
+  '**',
+  '*=',
+  '*',
+  '++',
+  '+=',
+  '+',
+  '--',
+  '-=',
+  '-',
+  '/=',
+  '/',
+  '%=',
+  '%',
+  '&&=',
+  '&&',
+  '&=',
+  '&',
+  '||=',
+  '||',
+  '|=',
+  '|',
+  '^=',
+  '^',
+  '~',
+  '!',
+  '??=',
+  '??',
+  '?.',
+  '?',
+  '=',
+  '[',
+  ']',
+  '(',
+  ')',
+  '{',
+  '}',
+  ';',
+  ',',
+  ':',
+  '#',
+]
+
+class OperatorNode {
+  children = new Map<number, OperatorNode>()
+  value: string | null = null
+}
+
+const OPERATOR_TRIE: OperatorNode = (() => {
+  const root = new OperatorNode()
+  for (const op of OPERATORS) {
+    let node = root
+    for (let i = 0; i < op.length; i++) {
+      const code = op.charCodeAt(i)
+      let child = node.children.get(code)
+      if (!child) {
+        child = new OperatorNode()
+        node.children.set(code, child)
+      }
+      node = child
+    }
+    node.value = op
+  }
+  return root
+})()
+
+// Tokens after which a `/` should be parsed as a regex literal start
+// rather than a division operator.
 const REGEX_PRECEDING_TOKENS = new Set<string>([
   '(',
   '[',
@@ -69,6 +308,54 @@ const REGEX_PRECEDING_TOKENS = new Set<string>([
   Keyword.DEFAULT,
 ])
 
+function isWhitespace(code: number): boolean {
+  return code === 0x20 /* space */
+    || code === 0x09 /* tab */
+    || code === 0x0A /* LF */
+    || code === 0x0D /* CR */
+    || code === 0x0B /* VT */
+    || code === 0x0C /* FF */
+    || code === 0xA0 /* NBSP */
+}
+
+function isAsciiLetter(code: number): boolean {
+  return (code >= 0x41 && code <= 0x5A) || (code >= 0x61 && code <= 0x7A)
+}
+
+function isVietnameseLetter(code: number): boolean {
+  return code >= 0x00C0 && code <= 0x1EF9
+}
+
+function isIdentStart(code: number): boolean {
+  return isAsciiLetter(code) || isVietnameseLetter(code)
+}
+
+function isDigit(code: number): boolean {
+  return code >= 0x30 && code <= 0x39
+}
+
+function isIdentCont(code: number): boolean {
+  return isIdentStart(code) || isDigit(code)
+}
+
+function isHexDigit(code: number): boolean {
+  return isDigit(code)
+    || (code >= 0x41 && code <= 0x46)
+    || (code >= 0x61 && code <= 0x66)
+}
+
+function isOctalDigit(code: number): boolean {
+  return code >= 0x30 && code <= 0x37
+}
+
+function isBinaryDigit(code: number): boolean {
+  return code === 0x30 || code === 0x31
+}
+
+function isAsciiWordChar(code: number): boolean {
+  return isAsciiLetter(code) || isDigit(code) || code === 0x5F /* _ */
+}
+
 export class Tokenizer {
   private parser: Parser
 
@@ -81,12 +368,14 @@ export class Tokenizer {
     this.cursor = 0
   }
 
+  public getCursor(): number {
+    return this.cursor
+  }
+
   public rollback(step: number): number {
     if (this.parser.lookahead)
       this.parser.lookahead.end -= step
-
     this.cursor -= step
-
     return this.cursor
   }
 
@@ -99,67 +388,83 @@ export class Tokenizer {
   }
 
   public getNextToken(): Token | null {
-    if (!this.hasMoreTokens()) {
-      return null
-    }
-
-    const whitespaceMatch = /^\s+/.exec(this.parser.syntax.slice(this.cursor))
-    if (whitespaceMatch) {
-      this.cursor += whitespaceMatch[0].length
-      return this.getNextToken()
-    }
-
-    const string = this.parser.syntax.slice(this.cursor)
+    const source = this.parser.syntax
+    const length = source.length
 
-    if (string[0] === '`') {
-      const tok = this.scanTemplateLiteral()
-      this.lastTokenType = tok.type as string
-      return tok
-    }
+    while (this.cursor < length) {
+      const start = this.cursor
+      const code = source.charCodeAt(this.cursor)
 
-    if (string[0] === '/' && string[1] !== '/' && string[1] !== '*' && this.isRegexExpected()) {
-      const tok = this.scanRegexLiteral()
-      if (tok) {
-        this.lastTokenType = tok.type as string
-        return tok
+      if (isWhitespace(code)) {
+        this.cursor++
+        continue
       }
-    }
 
-    for (const [regexp, tokenType] of Specs) {
-      const tokenValue = this.match(regexp, string)
+      // Line comment: //...
+      if (code === 0x2F && source.charCodeAt(this.cursor + 1) === 0x2F) {
+        this.cursor += 2
+        while (this.cursor < length && source.charCodeAt(this.cursor) !== 0x0A) {
+          this.cursor++
+        }
+        continue
+      }
 
-      if (tokenValue === null) {
+      // Block comment: /* ... */
+      if (code === 0x2F && source.charCodeAt(this.cursor + 1) === 0x2A) {
+        this.cursor += 2
+        while (this.cursor < length) {
+          if (source.charCodeAt(this.cursor) === 0x2A
+            && source.charCodeAt(this.cursor + 1) === 0x2F) {
+            this.cursor += 2
+            break
+          }
+          this.cursor++
+        }
         continue
       }
 
-      if (tokenType === null) {
-        return this.getNextToken()
+      // Template literal
+      if (code === 0x60) {
+        return this.scanTemplateLiteral(start)
       }
 
-      if (tokenType === Keyword.IDENTIFIER && tokenValue.includes(' ')) {
-        const truncated = this.truncateBeforeEmbeddedKeyword(tokenValue)
-        if (truncated.length !== tokenValue.length) {
-          this.cursor -= tokenValue.length - truncated.length
-          this.lastTokenType = tokenType as string
-          return {
-            type: tokenType,
-            value: truncated,
-            start: this.cursor - truncated.length,
-            end: this.cursor,
-          }
+      // Regex literal (context-sensitive)
+      if (code === 0x2F && this.isRegexExpected()) {
+        const tok = this.scanRegexLiteral(start)
+        if (tok !== null) {
+          this.lastTokenType = tok.type as string
+          return tok
         }
       }
 
-      this.lastTokenType = tokenType as string
-      return {
-        type: tokenType,
-        value: tokenValue,
-        start: this.cursor - String(tokenValue).length,
-        end: this.cursor,
+      // String literals
+      if (code === 0x22 || code === 0x27) {
+        return this.scanString(start, code)
+      }
+
+      // Numeric literals: digits, or `.` followed by digit
+      if (isDigit(code)) {
+        return this.scanNumber(start)
+      }
+      if (code === 0x2E /* . */ && isDigit(source.charCodeAt(this.cursor + 1))) {
+        return this.scanNumber(start)
+      }
+
+      // Identifier / keyword
+      if (isIdentStart(code)) {
+        return this.scanIdentifierOrKeyword(start)
       }
+
+      // Operator (longest match via trie)
+      const opTok = this.scanOperator(start)
+      if (opTok !== null) {
+        return opTok
+      }
+
+      throw new SyntaxError(`Unexpected token: "${source[this.cursor]}"`)
     }
 
-    throw new SyntaxError(`Unexpected token: "${string[0]}"`)
+    return null
   }
 
   private isRegexExpected(): boolean {
@@ -168,50 +473,334 @@ export class Tokenizer {
     return REGEX_PRECEDING_TOKENS.has(this.lastTokenType)
   }
 
-  private truncateBeforeEmbeddedKeyword(value: string): string {
-    const words = value.split(/\s+/)
-    let truncated = words[0]
-    for (let i = 1; i < words.length; i++) {
-      const rest = words.slice(i).join(' ')
-      for (const [regexp, tokenType] of Specs) {
-        if (tokenType === null || tokenType === Keyword.IDENTIFIER)
+  private scanString(start: number, quote: number): Token {
+    const source = this.parser.syntax
+    const length = source.length
+    let i = start + 1
+    while (i < length) {
+      const ch = source.charCodeAt(i)
+      if (ch === 0x5C /* \ */) {
+        i += 2
+        continue
+      }
+      if (ch === quote) {
+        i++
+        const value = source.slice(start, i)
+        this.cursor = i
+        this.lastTokenType = Keyword.STRING
+        return {
+          type: Keyword.STRING,
+          value,
+          start,
+          end: i,
+        }
+      }
+      i++
+    }
+    throw new SyntaxError(`Unterminated string literal at ${start}`)
+  }
+
+  private scanNumber(start: number): Token {
+    const source = this.parser.syntax
+    const length = source.length
+    let i = start
+
+    // Leading dot decimal: .5, .5e2
+    if (source.charCodeAt(i) === 0x2E /* . */) {
+      i++
+      while (i < length) {
+        const c = source.charCodeAt(i)
+        if (isDigit(c) || c === 0x5F) {
+          i++
           continue
-        const m = regexp.exec(`${rest};`)
-        if (m && m.index === 0 && /^[A-Za-z\u00C0-\u1EF9]/.test(m[0])) {
-          return truncated
         }
+        break
+      }
+      i = this.consumeExponent(i)
+      i = this.consumeBigIntSuffix(i)
+      return this.emitNumber(start, i)
+    }
+
+    // 0x / 0o / 0b
+    if (source.charCodeAt(i) === 0x30) {
+      const next = source.charCodeAt(i + 1)
+      if (next === 0x78 || next === 0x58) {
+        i += 2
+        while (i < length) {
+          const c = source.charCodeAt(i)
+          if (isHexDigit(c) || c === 0x5F) {
+            i++
+            continue
+          }
+          break
+        }
+        i = this.consumeBigIntSuffix(i)
+        return this.emitNumber(start, i)
+      }
+      if (next === 0x6F || next === 0x4F) {
+        i += 2
+        while (i < length) {
+          const c = source.charCodeAt(i)
+          if (isOctalDigit(c) || c === 0x5F) {
+            i++
+            continue
+          }
+          break
+        }
+        i = this.consumeBigIntSuffix(i)
+        return this.emitNumber(start, i)
+      }
+      if (next === 0x62 || next === 0x42) {
+        i += 2
+        while (i < length) {
+          const c = source.charCodeAt(i)
+          if (isBinaryDigit(c) || c === 0x5F) {
+            i++
+            continue
+          }
+          break
+        }
+        i = this.consumeBigIntSuffix(i)
+        return this.emitNumber(start, i)
+      }
+    }
+
+    // Decimal: digits[_digits]*[.digits[_digits]*]?
+    while (i < length) {
+      const c = source.charCodeAt(i)
+      if (isDigit(c) || c === 0x5F) {
+        i++
+        continue
+      }
+      break
+    }
+    if (source.charCodeAt(i) === 0x2E /* . */) {
+      i++
+      while (i < length) {
+        const c = source.charCodeAt(i)
+        if (isDigit(c) || c === 0x5F) {
+          i++
+          continue
+        }
+        break
+      }
+    }
+    i = this.consumeExponent(i)
+    i = this.consumeBigIntSuffix(i)
+    return this.emitNumber(start, i)
+  }
+
+  private consumeExponent(i: number): number {
+    const source = this.parser.syntax
+    const length = source.length
+    const c = source.charCodeAt(i)
+    if (c !== 0x65 && c !== 0x45 /* e/E */) {
+      return i
+    }
+    let j = i + 1
+    const sign = source.charCodeAt(j)
+    if (sign === 0x2B || sign === 0x2D /* + or - */) {
+      j++
+    }
+    if (!isDigit(source.charCodeAt(j))) {
+      return i
+    }
+    j++
+    while (j < length) {
+      const cc = source.charCodeAt(j)
+      if (isDigit(cc) || cc === 0x5F) {
+        j++
+        continue
       }
-      truncated += ` ${words[i]}`
+      break
     }
-    return value
+    return j
   }
 
-  private scanRegexLiteral(): Token | null {
+  private consumeBigIntSuffix(i: number): number {
+    if (this.parser.syntax.charCodeAt(i) === 0x6E /* n */) {
+      return i + 1
+    }
+    return i
+  }
+
+  private emitNumber(start: number, end: number): Token {
+    const value = this.parser.syntax.slice(start, end)
+    this.cursor = end
+    this.lastTokenType = Keyword.NUMBER
+    return {
+      type: Keyword.NUMBER,
+      value,
+      start,
+      end,
+    }
+  }
+
+  private scanIdentifierOrKeyword(start: number): Token {
+    // Try keyword trie first; on a hit with passing boundary we emit the keyword.
+    // Otherwise fall back to identifier scanning (multi-word with embedded
+    // keyword detection).
+    const kw = this.matchKeyword(start)
+    if (kw !== null) {
+      this.cursor = kw.end
+      this.lastTokenType = kw.type as string
+      return {
+        type: kw.type,
+        value: this.parser.syntax.slice(start, kw.end),
+        start,
+        end: kw.end,
+      }
+    }
+    return this.scanIdentifier(start)
+  }
+
+  private matchKeyword(start: number): { type: Keyword, end: number } | null {
+    const source = this.parser.syntax
+    const length = source.length
+    let node: TrieNode | undefined = KEYWORD_TRIE
+    let i = start
+    let bestType: Keyword | null = null
+    let bestEnd = -1
+
+    while (i < length && node !== undefined) {
+      const code = source.charCodeAt(i)
+      const next = node.children.get(code)
+      if (next === undefined)
+        break
+      i++
+      node = next
+      if (node.type !== null && this.boundaryOk(node.boundary, i)) {
+        bestType = node.type
+        bestEnd = i
+      }
+    }
+
+    if (bestType !== null && bestEnd !== -1) {
+      return { type: bestType, end: bestEnd }
+    }
+    return null
+  }
+
+  private boundaryOk(kind: BoundaryKind, end: number): boolean {
+    if (kind === Boundary.NONE)
+      return true
+    if (end >= this.parser.syntax.length)
+      return true
+    const code = this.parser.syntax.charCodeAt(end)
+    if (kind === Boundary.WORD) {
+      return !isAsciiWordChar(code)
+    }
+    return !isIdentStart(code)
+  }
+
+  private scanIdentifier(start: number): Token {
     const source = this.parser.syntax
-    const start = this.cursor
+    const length = source.length
+    let i = start
+    if (!isIdentStart(source.charCodeAt(i))) {
+      throw new SyntaxError(`Unexpected token: "${source[i]}"`)
+    }
+    i++
+    while (i < length && isIdentCont(source.charCodeAt(i))) {
+      i++
+    }
+
+    // Multi-word identifier: consume ` <word>` repeatedly, but stop before any
+    // word at which a keyword matches (matchKeyword only peeks; cursor is untouched).
+    while (i < length) {
+      if (source.charCodeAt(i) !== 0x20 /* space */)
+        break
+      const wordStart = i + 1
+      if (wordStart >= length)
+        break
+      const wordCode = source.charCodeAt(wordStart)
+      if (!isIdentStart(wordCode))
+        break
+      if (this.matchKeyword(wordStart) !== null) {
+        break
+      }
+      i = wordStart + 1
+      while (i < length && isIdentCont(source.charCodeAt(i))) {
+        i++
+      }
+    }
+
+    const value = source.slice(start, i)
+    this.cursor = i
+    this.lastTokenType = Keyword.IDENTIFIER
+    return {
+      type: Keyword.IDENTIFIER,
+      value,
+      start,
+      end: i,
+    }
+  }
+
+  private scanOperator(start: number): Token | null {
+    const source = this.parser.syntax
+    const length = source.length
+    let node: OperatorNode | undefined = OPERATOR_TRIE
+    let i = start
+    let bestEnd = -1
+    let bestValue: string | null = null
+
+    while (i < length && node !== undefined) {
+      const code = source.charCodeAt(i)
+      const next = node.children.get(code)
+      if (next === undefined)
+        break
+      i++
+      node = next
+      if (node.value !== null) {
+        bestEnd = i
+        bestValue = node.value
+      }
+    }
+
+    if (bestValue === null || bestEnd === -1) {
+      return null
+    }
+    this.cursor = bestEnd
+    this.lastTokenType = bestValue
+    return {
+      type: bestValue,
+      value: bestValue,
+      start,
+      end: bestEnd,
+    }
+  }
+
+  private scanRegexLiteral(start: number): Token | null {
+    const source = this.parser.syntax
+    const length = source.length
     let i = start + 1
     let inCharClass = false
 
-    while (i < source.length) {
-      const ch = source[i]
-      if (ch === '\\') {
+    while (i < length) {
+      const ch = source.charCodeAt(i)
+      if (ch === 0x5C /* \ */) {
         i += 2
         continue
       }
-      if (ch === '[') {
+      if (ch === 0x5B /* [ */) {
         inCharClass = true
         i++
         continue
       }
-      if (ch === ']') {
+      if (ch === 0x5D /* ] */) {
         inCharClass = false
         i++
         continue
       }
-      if (ch === '/' && !inCharClass) {
+      if (ch === 0x2F /* / */ && !inCharClass) {
         i++
-        while (i < source.length && /[a-z]/.test(source[i])) {
-          i++
+        while (i < length) {
+          const fc = source.charCodeAt(i)
+          if (fc >= 0x61 && fc <= 0x7A) {
+            i++
+            continue
+          }
+          break
         }
         const value = source.slice(start, i)
         this.cursor = i
@@ -222,32 +811,32 @@ export class Tokenizer {
           end: i,
         }
       }
-      if (ch === '\n') {
+      if (ch === 0x0A /* \n */) {
         return null
       }
       i++
     }
-
     return null
   }
 
-  private scanTemplateLiteral(): Token {
+  private scanTemplateLiteral(start: number): Token {
     const source = this.parser.syntax
-    const start = this.cursor
+    const length = source.length
     let i = start + 1
 
-    while (i < source.length) {
-      const ch = source[i]
+    while (i < length) {
+      const ch = source.charCodeAt(i)
 
-      if (ch === '\\') {
+      if (ch === 0x5C /* \ */) {
         i += 2
         continue
       }
 
-      if (ch === '`') {
+      if (ch === 0x60 /* ` */) {
         i++
         const value = source.slice(start, i)
         this.cursor = i
+        this.lastTokenType = 'TemplateLiteral'
         return {
           type: 'TemplateLiteral',
           value,
@@ -256,46 +845,48 @@ export class Tokenizer {
         }
       }
 
-      if (ch === '$' && source[i + 1] === '{') {
+      if (ch === 0x24 /* $ */ && source.charCodeAt(i + 1) === 0x7B /* { */) {
         i += 2
         let depth = 1
-        while (i < source.length && depth > 0) {
-          const inner = source[i]
-          if (inner === '\\') {
+        while (i < length && depth > 0) {
+          const inner = source.charCodeAt(i)
+          if (inner === 0x5C) {
             i += 2
             continue
           }
-          if (inner === '"' || inner === '\'') {
+          if (inner === 0x22 || inner === 0x27) {
             const quote = inner
             i++
-            while (i < source.length && source[i] !== quote) {
-              if (source[i] === '\\')
+            while (i < length && source.charCodeAt(i) !== quote) {
+              if (source.charCodeAt(i) === 0x5C)
                 i++
               i++
             }
             i++
             continue
           }
-          if (inner === '`') {
+          if (inner === 0x60 /* ` */) {
             i++
-            while (i < source.length) {
-              if (source[i] === '\\') {
+            while (i < length) {
+              if (source.charCodeAt(i) === 0x5C) {
                 i += 2
                 continue
               }
-              if (source[i] === '$' && source[i + 1] === '{') {
+              if (source.charCodeAt(i) === 0x24
+                && source.charCodeAt(i + 1) === 0x7B) {
                 i += 2
                 let innerDepth = 1
-                while (i < source.length && innerDepth > 0) {
-                  if (source[i] === '{')
+                while (i < length && innerDepth > 0) {
+                  const ic = source.charCodeAt(i)
+                  if (ic === 0x7B)
                     innerDepth++
-                  else if (source[i] === '}')
+                  else if (ic === 0x7D)
                     innerDepth--
                   i++
                 }
                 continue
               }
-              if (source[i] === '`') {
+              if (source.charCodeAt(i) === 0x60) {
                 i++
                 break
               }
@@ -303,9 +894,9 @@ export class Tokenizer {
             }
             continue
           }
-          if (inner === '{')
+          if (inner === 0x7B)
             depth++
-          else if (inner === '}')
+          else if (inner === 0x7D)
             depth--
           i++
         }
@@ -317,16 +908,4 @@ export class Tokenizer {
 
     throw new SyntaxError(`Template literal không đóng, bắt đầu tại vị trí ${start}`)
   }
-
-  private match(regexp: RegExp, syntax: string): string | null {
-    const formattedSyntax = syntax.split(';')
-    const matched = regexp.exec(formattedSyntax[0].concat(';'))
-
-    if (matched && matched.index === 0) {
-      this.cursor += matched[0].length
-      return matched[0]
-    }
-
-    return null
-  }
 }
diff --git a/packages/shared/index.ts b/packages/shared/index.ts
index 4f4a72f..396b263 100644
--- a/packages/shared/index.ts
+++ b/packages/shared/index.ts
@@ -1,5 +1,4 @@
 export * from './parser/keyword.enum'
 export * from './parser/node.interface'
 export * from './parser/operator.type'
-export * from './parser/spec.type'
 export * from './parser/token.type'
diff --git a/packages/shared/parser/spec.type.ts b/packages/shared/parser/spec.type.ts
deleted file mode 100644
index 88daf54..0000000
--- a/packages/shared/parser/spec.type.ts
+++ /dev/null
@@ -1,4 +0,0 @@
-import type { Keyword } from './keyword.enum'
-import type { Operator } from './operator.type'
-
-export type Spec = [RegExp, Keyword | Operator | null]