Unicode and vulgar fractions
This little snippet scans a block of text and replaces fractions with the equivalent Unicode vulgar fraction character where one exists, removing the spaces when the fraction immediately follows another number. For example, "1 1/2" is replaced by "1½".
    import re

    FRACRE = re.compile(ur'(\d\ +)?(\d+/\d+)\b')

    FRACTIONS = {
        u'1/2' : u'\u00BD',
        u'1/4' : u'\u00BC',
        u'3/4' : u'\u00BE',
        u'1/3' : u'\u2153',
        u'2/3' : u'\u2154',
        u'1/5' : u'\u2155',
        u'2/5' : u'\u2156',
        u'3/5' : u'\u2157',
        u'4/5' : u'\u2158',
        u'1/6' : u'\u2159',
        u'5/6' : u'\u215A',
        u'1/8' : u'\u215B',
        u'3/8' : u'\u215C',
        u'5/8' : u'\u215D',
        u'7/8' : u'\u215E',
    }

    def fractions(s):
        def subfrac(m):
            pre, post = m.groups()
            frac = FRACTIONS.get(post)
            if frac is None:
                start, end = m.span()
                return m.string[start:end]
            if pre is not None:
                frac = pre[0] + frac
            return frac
        return FRACRE.sub(subfrac, s)
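The snippet above is written for Python 2 (the ur'' prefix is a syntax error in Python 3). As a rough sketch, assuming Python 3 where every str is already Unicode, a port could look like the following. The regex and lookup table are the same as above; only the string prefixes are dropped, and the unmatched case uses m.group(0). This is an assumed port, not the original author's code.

```python
import re

# Optional "digit followed by spaces" prefix, then a numerator/denominator pair.
FRACRE = re.compile(r'(\d\ +)?(\d+/\d+)\b')

FRACTIONS = {
    '1/2': '\u00BD', '1/4': '\u00BC', '3/4': '\u00BE',
    '1/3': '\u2153', '2/3': '\u2154',
    '1/5': '\u2155', '2/5': '\u2156', '3/5': '\u2157', '4/5': '\u2158',
    '1/6': '\u2159', '5/6': '\u215A',
    '1/8': '\u215B', '3/8': '\u215C', '5/8': '\u215D', '7/8': '\u215E',
}

def fractions(s):
    def subfrac(m):
        pre, post = m.groups()
        frac = FRACTIONS.get(post)
        if frac is None:
            # No single-character equivalent: leave the matched text untouched.
            return m.group(0)
        if pre is not None:
            # Keep the preceding digit, drop the spaces between it and the fraction.
            frac = pre[0] + frac
        return frac
    return FRACRE.sub(subfrac, s)

print(fractions("Add 1 1/2 cups of flour and 3/4 cup of sugar."))
```

Fractions with no entry in the table, such as "5/7", pass through unchanged.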
First of all, when the function has to return the matched text unchanged, it’s more direct to use m.group(0):
    def fractions(s):
        def subfrac(m):
            pre, post = m.groups()
            frac = FRACTIONS.get(post)
            if frac is None:
                return m.group(0)
            ...

But the algorithm can be improved with a more sophisticated regex that avoids all the unnecessary calls to subfrac:
    FRACRE = re.compile(ur'(?:(?<=\d) +)?(' + '|'.join(FRACTIONS.keys()) + ur')\b')

    def fractions(s):
        def subfrac(m):
            return FRACTIONS[m.group(1)]
        return FRACRE.sub(subfrac, s)

Comment by Chema Cortés — 2005-04-19 @ 4:40 am
Sorry for the incomplete post. You can see the fully formatted code in this comment.
Comment by Chema Cortés — 2005-04-19 @ 4:43 am
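The alternation-based regex suggested in the comments can be sketched in Python 3 roughly as follows. Two details here are assumptions of this sketch and not part of the comment: the keys are passed through re.escape before being joined, and a (?<!\d) guard is placed in front of the alternation so that a longer numerator such as "11/2" is left alone instead of matching its trailing "1/2". The table is shortened to a few entries to keep the example small.

```python
import re

# A shortened table for illustration; the full mapping works the same way.
FRACTIONS = {
    '1/2': '\u00BD', '1/4': '\u00BC', '3/4': '\u00BE',
    '1/3': '\u2153', '2/3': '\u2154',
    '1/8': '\u215B', '3/8': '\u215C', '5/8': '\u215D', '7/8': '\u215E',
}

# Only fractions that have a Unicode equivalent can match at all,
# so the replacement function never has to handle a lookup miss.
_ALTERNATION = '|'.join(re.escape(key) for key in FRACTIONS)

# (?:(?<=\d) +)?  consume spaces only when they directly follow a digit
# (?<!\d)         assumed guard: the fraction must not continue a longer number
FRACRE = re.compile(r'(?:(?<=\d) +)?(?<!\d)(' + _ALTERNATION + r')\b')

def fractions(s):
    return FRACRE.sub(lambda m: FRACTIONS[m.group(1)], s)

print(fractions("1 1/2 cups, 3/8 inch, 11/2 stays as it is"))
```

Because the lookbehind matches a position without consuming the digit, the preceding number does not need to be re-inserted in the replacement.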