Commit 979419e

Fix potential conversion error
And update opencc-data to 1.0.5
1 parent 2cdc350 commit 979419e

4 files changed

Lines changed: 87 additions & 60 deletions

.github/workflows/build.yml

Lines changed: 10 additions & 3 deletions
@@ -25,8 +25,15 @@ jobs:
         run: |
           build/prepare.sh
           python build/main.py
-      - name: Upload artifact
+      - name: Upload FanWunMing
         uses: actions/upload-artifact@v2
         with:
-          name: Font files
-          path: output/*.ttf
+          name: FanWunMing
+          path: |
+            output/FanWunMing-*.ttf
+            !output/FanWunMing-TW-*.ttf
+      - name: Upload FanWunMing-TW
+        uses: actions/upload-artifact@v2
+        with:
+          name: FanWunMing-TW
+          path: output/FanWunMing-TW-*.ttf
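
The two upload steps split the build output by file name: the first takes every `FanWunMing-*.ttf` except the `FanWunMing-TW-*.ttf` files, and the second takes only the latter. A minimal Python sketch of that include/exclude logic is below. It only approximates the glob matching performed by `actions/upload-artifact` by using `fnmatchcase`; the `select` helper and the output file names are hypothetical.

```python
from fnmatch import fnmatchcase


def select(paths, patterns):
    """Keep paths matching an include pattern and no '!'-prefixed exclude pattern."""
    includes = [p for p in patterns if not p.startswith('!')]
    excludes = [p[1:] for p in patterns if p.startswith('!')]
    return [
        path for path in paths
        if any(fnmatchcase(path, inc) for inc in includes)
        and not any(fnmatchcase(path, exc) for exc in excludes)
    ]


# Hypothetical build outputs; the real names are produced by build/main.py.
outputs = [
    'output/FanWunMing-R.ttf',
    'output/FanWunMing-B.ttf',
    'output/FanWunMing-TW-R.ttf',
    'output/FanWunMing-TW-B.ttf',
]

fanwunming = select(outputs, ['output/FanWunMing-*.ttf', '!output/FanWunMing-TW-*.ttf'])
fanwunming_tw = select(outputs, ['output/FanWunMing-TW-*.ttf'])
```

Without the exclude line, the first pattern would also match the `-TW-` files, so both artifacts would contain them.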

LICENSE

Lines changed: 25 additions & 32 deletions
@@ -1,35 +1,28 @@
-Copyright 2020 Ayaka Mikazuki (https://ayaka.shn.hk/).
+This Font Software is licensed under the SIL Open Font License,
+Version 1.1.
 
-
-Copyright 2014-2019 Adobe (http://www.adobe.com/), with Reserved Font
-Name 'Source'. Source is a trademark of Adobe in the United States
-and/or other countries.
-
-
-This Font Software is licensed under the SIL Open Font License, Version 1.1.
 This license is copied below, and is also available with a FAQ at:
 http://scripts.sil.org/OFL
 
-
 -----------------------------------------------------------
 SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
 -----------------------------------------------------------
 
 PREAMBLE
 The goals of the Open Font License (OFL) are to stimulate worldwide
-development of collaborative font projects, to support the font creation
-efforts of academic and linguistic communities, and to provide a free and
-open framework in which fonts may be shared and improved in partnership
-with others.
+development of collaborative font projects, to support the font
+creation efforts of academic and linguistic communities, and to
+provide a free and open framework in which fonts may be shared and
+improved in partnership with others.
 
 The OFL allows the licensed fonts to be used, studied, modified and
 redistributed freely as long as they are not sold by themselves. The
-fonts, including any derivative works, can be bundled, embedded,
+fonts, including any derivative works, can be bundled, embedded,
 redistributed and/or sold with any software provided that any reserved
 names are not used by derivative works. The fonts and derivatives,
 however, cannot be released under any other type of license. The
-requirement for fonts to remain under this license does not apply
-to any document created using the fonts or their derivatives.
+requirement for fonts to remain under this license does not apply to
+any document created using the fonts or their derivatives.
 
 DEFINITIONS
 "Font Software" refers to the set of files released by the Copyright
@@ -39,25 +32,25 @@ include source files, build scripts and documentation.
 "Reserved Font Name" refers to any names specified as such after the
 copyright statement(s).
 
-"Original Version" refers to the collection of Font Software components as
-distributed by the Copyright Holder(s).
+"Original Version" refers to the collection of Font Software
+components as distributed by the Copyright Holder(s).
 
-"Modified Version" refers to any derivative made by adding to, deleting,
-or substituting -- in part or in whole -- any of the components of the
-Original Version, by changing formats or by porting the Font Software to a
-new environment.
+"Modified Version" refers to any derivative made by adding to,
+deleting, or substituting -- in part or in whole -- any of the
+components of the Original Version, by changing formats or by porting
+the Font Software to a new environment.
 
 "Author" refers to any designer, engineer, programmer, technical
 writer or other person who contributed to the Font Software.
 
 PERMISSION & CONDITIONS
 Permission is hereby granted, free of charge, to any person obtaining
-a copy of the Font Software, to use, study, copy, merge, embed, modify,
-redistribute, and sell modified and unmodified copies of the Font
-Software, subject to the following conditions:
+a copy of the Font Software, to use, study, copy, merge, embed,
+modify, redistribute, and sell modified and unmodified copies of the
+Font Software, subject to the following conditions:
 
-1) Neither the Font Software nor any of its individual components,
-in Original or Modified Versions, may be sold by itself.
+1) Neither the Font Software nor any of its individual components, in
+Original or Modified Versions, may be sold by itself.
 
 2) Original or Modified Versions of the Font Software may be bundled,
 redistributed and/or sold with any software, provided that each copy
@@ -67,9 +60,9 @@ in the appropriate machine-readable metadata fields within text or
 binary files as long as those fields can be easily viewed by the user.
 
 3) No Modified Version of the Font Software may use the Reserved Font
-Name(s) unless explicit written permission is granted by the corresponding
-Copyright Holder. This restriction only applies to the primary font name as
-presented to the users.
+Name(s) unless explicit written permission is granted by the
+corresponding Copyright Holder. This restriction only applies to the
+primary font name as presented to the users.
 
 4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font
 Software shall not be used to promote, endorse or advertise any
8073
5) The Font Software, modified or unmodified, in part or in whole,
8174
must be distributed entirely under this license, and must not be
8275
distributed under any other license. The requirement for fonts to
83-
remain under this license does not apply to any document created
84-
using the Font Software.
76+
remain under this license does not apply to any document created using
77+
the Font Software.
8578

8679
TERMINATION
8780
This license becomes null and void if any of the above conditions are

build/main.py

Lines changed: 45 additions & 13 deletions
@@ -1,29 +1,51 @@
 from collections import defaultdict
 from datetime import date
 from glob import glob
-from itertools import chain
+from itertools import chain, groupby
 import json
 from opencc import OpenCC
 import os
 import subprocess
 
-FONT_VERSION = 1.003
+FONT_VERSION = 1.004
 
 # Define the max entries size in a subtable.
 # We define a number that is small enough here, so that the entries will not exceed
 # the size limit.
 SUBTABLE_MAX_COUNT = 4000
 
-# This function is used to split a GSUB table into several subtables.
-def grouper(lst, n, start=0):
+# The following two functions are used to split a GSUB table into several subtables.
+def grouper(iterable, n=SUBTABLE_MAX_COUNT):
     '''
     Split a list into chunks of size n.
-    >>> list(grouper([1, 2, 3, 4, 5], 2))
+    >>> list(grouper([1, 2, 3, 4, 5], n=2))
     [[1, 2], [3, 4], [5]]
+    >>> list(grouper([1, 2, 3, 4, 5, 6], n=2))
+    [[1, 2], [3, 4], [5, 6]]
     '''
-    while start < len(lst):
-        yield lst[start:start+n]
-        start += n
+    iterator = iter(iterable)
+    while True:
+        lst = []
+        try:
+            for _ in range(n):
+                lst.append(next(iterator))
+        except StopIteration:
+            if lst:
+                yield lst
+            break
+        yield lst
+
+def grouper2(iterable, n=SUBTABLE_MAX_COUNT, key=None):
+    '''
+    Split a iterator into chunks of maximum size n by the given key.
+    >>> list(grouper2(['AA', 'BBB', 'CCC', 'DDD', 'EE'], n=3, key=len))
+    [['AA'], ['BBB', 'CCC', 'DDD'], ['EE']]
+    >>> list(grouper2(['AA', 'BBB', 'CCC', 'DDD', 'EE'], n=2, key=len))
+    [['AA'], ['BBB', 'CCC'], ['DDD'], ['EE']]
+    '''
+    for _, vx in groupby(iterable, key=key):
+        for vs in grouper(vx, n):
+            yield vs
 
 # An opentype font can hold at most 65535 glyphs.
 MAX_GLYPH_COUNT = 65535
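
To try the new chunking outside the build script, a rough standalone equivalent of the two helpers can be sketched as follows. The names `chunks` and `chunks_by_key` are illustrative and not part of the commit, and `islice` stands in for the explicit `next()` loop; the behaviour is intended to match the doctests above.

```python
from itertools import groupby, islice


def chunks(iterable, n):
    """Yield lists of at most n items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


def chunks_by_key(iterable, n, key=None):
    """Yield chunks of at most n items; a chunk never spans two key values."""
    # groupby only merges *adjacent* items with equal keys,
    # so the input is expected to be sorted by the same key.
    for _, group in groupby(iterable, key=key):
        yield from chunks(group, n)


words = ['AA', 'BBB', 'CCC', 'DDD', 'EE']
by_len_3 = list(chunks_by_key(words, n=3, key=len))
by_len_2 = list(chunks_by_key(words, n=2, key=len))
```

Because `groupby` only groups consecutive items, passing unsorted input produces more (and smaller) chunks rather than an error, which is why the conversion list is sorted by length before it reaches these helpers.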
@@ -142,7 +164,8 @@ def build_opencc_word_table(codepoints_tonggui, codepoints_font, twp=False):
         codepoints.update(codepoints_v)
 
     # Sort from longest to shortest to force longest match
-    return sorted(((k, v) for k, v in entries.items()), key=lambda k_v: (-len(k_v[0]), k_v[0])), codepoints
+    conversion_item_len = lambda conversion_item: len(conversion_item[0])
+    return sorted(entries.items(), key=conversion_item_len, reverse=True), codepoints
 
 def disassociate_codepoint_and_glyph_name(obj, codepoint, glyph_name):
     '''
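
The new sort key drops the alphabetical tie-break and relies on `sorted` being stable, including with `reverse=True`. The small sketch below compares the two orderings on made-up entries (the real keys are phrases from the OpenCC tables): both are longest-first, but entries of equal length now keep their dictionary insertion order.

```python
# Hypothetical entries; insertion order is deliberately not alphabetical.
entries = {'b': 'B', 'abc': 'X', 'a': 'A', 'ab': 'Y', 'zz': 'Z'}

# Old key: longest first, ties broken alphabetically by the source string.
old_order = sorted(entries.items(), key=lambda k_v: (-len(k_v[0]), k_v[0]))

# New key: longest first, ties keep insertion order (sorted() is stable).
new_order = sorted(entries.items(), key=lambda item: len(item[0]), reverse=True)

old_lengths = [len(k) for k, _ in old_order]
new_lengths = [len(k) for k, _ in new_order]
```

Either ordering keeps equal-length entries adjacent, which is the property that grouping by length with `itertools.groupby` depends on.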
@@ -275,29 +298,34 @@ def insert_empty_feature(obj, feature_name):
     obj['GSUB']['features'][feature_name] = []
 
 def create_word2pseu_table(obj, feature_name, conversions):
+    conversion_item_len = lambda conversion_item: len(conversion_item[0])
+    subtables = [{'substitutions': [{'from': glyph_names_k, 'to': pseudo_glyph_name} for glyph_names_k, pseudo_glyph_name in subtable]} for subtable in grouper2(conversions, key=conversion_item_len)] # {from: [a1, a2, ...], to: b}
     obj['GSUB']['features'][feature_name].append('word2pseu')
     obj['GSUB']['lookups']['word2pseu'] = {
         'type': 'gsub_ligature',
         'flags': {},
-        'subtables': [{'substitutions': subtable} for subtable in grouper(conversions, SUBTABLE_MAX_COUNT)]
+        'subtables': subtables
     }
     obj['GSUB']['lookupOrder'].append('word2pseu')
 
 def create_char2char_table(obj, feature_name, conversions):
+    subtables = [{k: v for k, v in subtable} for subtable in grouper(conversions)]
     obj['GSUB']['features'][feature_name].append('char2char')
     obj['GSUB']['lookups']['char2char'] = {
         'type': 'gsub_single',
         'flags': {},
-        'subtables': [{k: v for k, v in subtable} for subtable in grouper(conversions, SUBTABLE_MAX_COUNT)]
+        'subtables': subtables
     }
     obj['GSUB']['lookupOrder'].append('char2char')
 
 def create_pseu2word_table(obj, feature_name, conversions):
+    conversion_item_len = lambda conversion_item: len(conversion_item[1])
+    subtables = [{k: v for k, v in subtable} for subtable in grouper2(conversions, key=conversion_item_len)]
     obj['GSUB']['features'][feature_name].append('pseu2word')
     obj['GSUB']['lookups']['pseu2word'] = {
         'type': 'gsub_multiple',
         'flags': {},
-        'subtables': [{k: v for k, v in subtable} for subtable in grouper(conversions, SUBTABLE_MAX_COUNT)]
+        'subtables': subtables
     }
     obj['GSUB']['lookupOrder'].append('pseu2word')
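
To make the new `word2pseu` subtable layout concrete, here is a small sketch that builds the same structure from `(glyph_names, pseudo_glyph_name)` tuples. The glyph names are invented, `chunks_by_key` is a hypothetical stand-in for `grouper2`, and `n=2` replaces `SUBTABLE_MAX_COUNT` so that the splitting is visible on four entries.

```python
from itertools import groupby, islice


def chunks_by_key(iterable, n, key):
    """Stand-in for grouper2: chunks of at most n items sharing one key value."""
    for _, group in groupby(iterable, key=key):
        while chunk := list(islice(group, n)):
            yield chunk


# Hypothetical conversions, already sorted from longest to shortest source.
conversions = [
    (['g1', 'g2', 'g3'], 'pseu0'),
    (['g4', 'g5'], 'pseu1'),
    (['g6', 'g7'], 'pseu2'),
    (['g8', 'g9'], 'pseu3'),
]

subtables = [
    {'substitutions': [{'from': src, 'to': dst} for src, dst in chunk]}
    for chunk in chunks_by_key(conversions, n=2, key=lambda item: len(item[0]))
]

# Number of substitutions per subtable.
sizes = [len(subtable['substitutions']) for subtable in subtables]
```

With these inputs the three-glyph entry gets a subtable of its own, and the three two-glyph entries are split into subtables of two and one, so no subtable mixes source sequences of different lengths.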

@@ -341,6 +369,8 @@ def build_dest_path_from_src_path(path, twp=False):
 def go(path, twp=False):
     font = load_font(path, ttc_index=0)
 
+    # Determine the final Unicode range by the original font and OpenCC convert tables
+
     codepoints_font = build_codepoints_font(font)
     codepoints_tonggui = build_codepoints_tonggui() & codepoints_font

@@ -358,6 +388,8 @@ def go(path, twp=False):
     available_glyph_count = MAX_GLYPH_COUNT - get_glyph_count(font)
     assert available_glyph_count >= len(entries_word)
 
+    # Build glyph substitution tables and insert into font
+
     word2pseu_table = []
     char2char_table = []
     pseu2word_table = []
@@ -367,7 +399,7 @@ def go(path, twp=False):
         glyph_names_k = [codepoint_to_glyph_name(font, codepoint) for codepoint in codepoints_k]
         glyph_names_v = [codepoint_to_glyph_name(font, codepoint) for codepoint in codepoints_v]
         insert_empty_glyph(font, pseudo_glyph_name)
-        word2pseu_table.append({'from': glyph_names_k, 'to': pseudo_glyph_name})
+        word2pseu_table.append((glyph_names_k, pseudo_glyph_name))
         pseu2word_table.append((pseudo_glyph_name, glyph_names_v))
 
     for codepoint_k, codepoint_v in entries_char:

build/prepare.sh

Lines changed: 7 additions & 12 deletions
@@ -1,13 +1,8 @@
 #!/bin/sh
-mkdir -p output
-wget -q -nc -P cache https://github.com/ButTaiwan/genyo-font/releases/download/v1.501/GenYoMin.zip
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/STCharacters.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/STPhrases.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWPhrasesIT.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWPhrasesName.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWPhrasesOther.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWVariants.txt
-cat cache/TWPhrasesIT.txt cache/TWPhrasesName.txt cache/TWPhrasesOther.txt > cache/TWPhrases.txt
-wget -q -nc -P cache https://gist.githubusercontent.com/fatum12/941a10f31ac1ad48ccbc/raw/59d7e29b307ae3439317a975ef390cd729f9bc17/ttc2ttf.pe
-wget -q -nc -P cache https://raw.githubusercontent.com/rime-aca/character_set/e7d009a8a185a83f62ad2c903565b8bb85719221/通用規範漢字表.txt
-unzip -q -n -d cache cache/GenYoMin.zip "*.ttc"
+mkdir -p cache output
+cd cache
+curl -LsSO https://github.com/ButTaiwan/genyo-font/releases/download/v1.501/GenYoMin.zip
+curl -LsSZ --remote-name-all https://cdn.jsdelivr.net/npm/opencc-data@1.0.5/data/{STCharacters.txt,STPhrases.txt,TWPhrasesIT.txt,TWPhrasesName.txt,TWPhrasesOther.txt,TWVariants.txt}
+curl -LsSo 通用規範漢字表.txt https://raw.githubusercontent.com/rime-aca/character_set/e7d009a8a185a83f62ad2c903565b8bb85719221/%E9%80%9A%E7%94%A8%E8%A6%8F%E7%AF%84%E6%BC%A2%E5%AD%97%E8%A1%A8.txt
+cat TWPhrasesIT.txt TWPhrasesName.txt TWPhrasesOther.txt > TWPhrases.txt
+unzip -q -n GenYoMin.zip "*.ttc"
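
The single `curl` call for opencc-data relies on curl's own URL globbing: the `{a,b,...}` list is expanded by curl into one request per name, so it does not depend on brace expansion in `/bin/sh`. As a rough illustration, the Python sketch below spells out the URLs that the pattern stands for. The constant names are illustrative, and nothing is downloaded.

```python
# Base URL and file names copied from the curl line in build/prepare.sh.
OPENCC_DATA_BASE = 'https://cdn.jsdelivr.net/npm/opencc-data@1.0.5/data/'
OPENCC_DATA_FILES = [
    'STCharacters.txt',
    'STPhrases.txt',
    'TWPhrasesIT.txt',
    'TWPhrasesName.txt',
    'TWPhrasesOther.txt',
    'TWVariants.txt',
]

# One URL per file, in the order curl would expand the brace list.
urls = [OPENCC_DATA_BASE + name for name in OPENCC_DATA_FILES]

# The three TWPhrases*.txt parts that the script concatenates into TWPhrases.txt.
tw_phrase_parts = [name for name in OPENCC_DATA_FILES if name.startswith('TWPhrases')]
```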
