freesoftwhere.org

Friends, Romans, Italics: lend me your eyes! (for just a few minutes)

I should really subtitle this “part 1,” to fit into my established idiom of starting what’s meant to be a series and then letting the follow-ons stack up like cordwood in the Drafts folder. It’s worked well so far. I’m looking at several of them now … they seem so happy….

ANNYWAY, what I actually wanted to say is that I’ve got kind of an “ask.” I’m running an online survey as part of my PhD research, and I’d really be grateful if you’d give it a try. It’s about fonts; what you do is look at text samples and mark any stuff that looks bad. It’s designed to be something that you could finish entirely in less than half an hour (if you do all five samples), but you can do fewer if you want.

If you’re game and want to jump at it, the survey site is letter.fit — please go for it, and thanks!

The gist of this is that we want to collect responses from as wide a variety of readers as we can. It’s not about right answers or wrong answers. I’d love it if you’d give it a go and maybe spread the word, but if you don’t, that’s totally cool.

Just to be less self-promotey (although I certainly don’t reap any profit from it) for the unintrigued, and also to make for a more informative blog-read, I’ll say a tad more about what the survey does and why it’s a question. My apologies if you’ve listened to this song & dance twice….

What?

If you haven’t heard, I’ve been doing research about letter spacing in fonts: “good” spacing in text is a big part of how readability and legibility come together. But if you want to say anything “quantitative” about it — which would be nice, in these modern times — then you have to show people real text and get a feel for what they think of it. Those online games where you kern some letters in a big, full-window-width word can be fun, but they can’t tell you this sort of stuff.

Thus, the test: designed as just a “look at this sample, highlight things that stick out” deal. The value comes from a wide spectrum of samples & lots of variety!

You highlight letters, you tag ’em, we all win. Does everybody see the same letter-pairs as incorrect? In the same order? What about eyesight or time-of-day? Let’s find out.

So, the back-end has experimental test-fonts in it. When you visit it, it shows you some randomly chosen sample pages with randomly-chosen fonts from the test pool.
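I can’t show the real back-end, of course, but the basic “random pages × random fonts” pairing is simple enough to sketch in a few lines of Python (all of the pool contents and names below are invented for illustration, not the actual test data):

```python
import random

# Hypothetical stand-ins for the real sample pages and test fonts.
SAMPLE_PAGES = ["sample-en-1", "sample-de-1", "sample-fr-1",
                "sample-en-2", "sample-de-2"]
FONT_POOL = ["testfont-a", "testfont-b", "testfont-c", "testfont-d"]

def pick_trials(n=5):
    """Choose n distinct sample pages, each with an independently drawn font."""
    pages = random.sample(SAMPLE_PAGES, n)
    return [(page, random.choice(FONT_POOL)) for page in pages]

trials = pick_trials()
```

The point of drawing the font independently for each page is that no visitor sees a predictable page-to-font pairing.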

What’s different about the fonts? Just how the spacing works. I can’t tell you more than that! You have to go into it unbiased or else it doesn’t add anything!

Is this about spacing algorithms or has something ELSE been varied in the fonts and layouts? No telling! Do *I* think some algorithms are better than others? Maybe but maybe not! Are all the letters changed, or are some of them the same? That’s off limits! Is it actually totally random and all you’re really measuring is reader frustration in online media as the seasons change? I can’t tell you!

(I can tell you it has samples in English, German, and French, and it ought to work with any browser, mobile or desktop. If you find a bug, please let me know!)

Experimentifying

Using it is as simple as looking at the random text-and-font samples it shows you, then highlighting anything that you think looks too tightly spaced or too far apart. That’s it. You don’t have to give reasons, you don’t have to spend any particular amount of time on it.

In early trials, most people averaged 5 min with a sample; after that you’re not likely to see new stuff. But you can certainly take an hour if that’s what you want; there is no clock on it. It lets you look at five samples. After that, it stops.

You can restart it if you like, even immediately, but the “take a break” bit is built-in. There again, in earlier trials we noticed that it’s helpful to take a little breather time; you can start to get fatigued after too many samples in a row.

Figuring out that kind of stuff (and how to tune the trials so people can do it in a given amount of time) is all about test-design, and balancing the potential variables; it’s a pretty interesting rabbit hole to go down. I think the samples probably get easier if you do at least two — that is, at least if you’re not accustomed to inspecting font samples. Because once you see some differences, it “clicks” a little.

The site is totally anonymous, although it does ask you some general demographic & experience questions. It’s not tied to you in any way and there are no cookies (unless you’re eating some, which is definitely allowed).

Simulation of the data-analysis stage. Not to scale.

But the questions would let us compare & contrast on those various variables, which would allow us to see any patterns in what people report seeing — if such patterns emerge.

Which they might… or might not…. Do type-pros and laypeople see different issues? Or see the same but at different speeds? Or is there perhaps some effect but it’s way less significant than the noise level so it doesn’t affect anything? That’s why you have to measure.
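That “effect smaller than the noise” possibility is easy to illustrate with a toy Python sketch (the numbers below are entirely made up, not survey data):

```python
import random
import statistics

# Invented data: marks-per-sample for two hypothetical reader groups,
# with a small real difference in means swamped by per-reader spread.
random.seed(1)
pros = [random.gauss(10.5, 4.0) for _ in range(30)]  # "type pros"
lay = [random.gauss(10.0, 4.0) for _ in range(30)]   # laypeople

diff = statistics.mean(pros) - statistics.mean(lay)  # the effect
noise = statistics.stdev(pros + lay)                 # the spread
```

With a spread several times larger than the group difference, only a decent sample size lets you say whether the difference is real — hence the measuring.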

Software

Anyway, that’s where we’re at. I’ve been running these tests for a while now, but I’m getting to the end of my allotted time, so I kinda want to push for making sure we collect as diverse a data set as we can. The trends, commonalities, & differences all emerge in the data-analysis stage, so the more people who give it a try, the better the information-crunching is.

It’s been super interesting putting the testing apparatus and the analytical pieces together; I can assure you that those aspects of it are going to be fodder for conference talks from me for quite some time. E.g., there are a lot of data-science libraries out there; take a wild guess how many of them support vectors of non-numeric characters as array indices for their plotting functions…. Or can deal with one-story and two-story “g” as being different tokens. Or have some method to graph heatmap data onto a string object such as a text-page. It’s fun stuff.
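For the curious, the string-heatmap problem is roughly this: given one value per character, paint each character accordingly. None of the big plotting libraries hand you that directly, so you end up writing glue like the following pure-Python sketch (the weights are invented, and real output would feed a report rather than a raw string):

```python
# Paint per-character "mark frequency" data onto a text string as HTML.
def text_heatmap_html(text, weights):
    """weights: one value in [0, 1] per character; higher = redder."""
    spans = []
    for ch, w in zip(text, weights):
        # Linear ramp from white (w=0) to red (w=1).
        g = b = int(255 * (1 - w))
        spans.append(f'<span style="background:rgb(255,{g},{b})">{ch}</span>')
    return "".join(spans)

html = text_heatmap_html("Type", [0.0, 0.9, 0.2, 0.5])
```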

Is it going to result in me submitting new modules to Plotly or matplotlib? Not sure. Typographic research hooks might get in a lot of people’s way there so it’s rarely a simple patch; it also remains to be seen how best to generalize some of those bits and pieces.

Are you interested in a whole other rabbit-hole? Ask me about getting consistent color-schemes out of matplot, iGraph, Inkscape, and LibreOffice….
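The short version of my answer, for the curious: define the palette once as hex values and convert, rather than re-picking colors in each tool. A sketch (palette values invented; matplotlib and iGraph happily take hex strings as-is, and Inkscape can import a GIMP-style .gpl palette file):

```python
# One palette, many tools: hex values defined once, converted as needed.
PALETTE = {"ink": "#1a1a2e", "accent": "#e94560", "paper": "#f5f0e1"}

def hex_to_rgb(h):
    """'#rrggbb' -> (r, g, b) integer tuple."""
    h = h.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

# GIMP-palette lines, which Inkscape can read as a swatch file.
gpl_lines = ["GIMP Palette", "Name: thesis"] + [
    "{:3d} {:3d} {:3d}\t{}".format(*hex_to_rgb(v), k)
    for k, v in PALETTE.items()
]
```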

I also had to write a custom storage back-end for the mark DB. That part was a lot less fun. It involved JavaScript and other uncivilized notions.

But that’s for another time. If you want to take a look, please go ahead; I’d be grateful. Letter.fit. Your input is absolutely valuable — and I mean that, no matter who you are or how much/little you care about the minute details of typography. Oh, and also, please do feel free to share the site link around; it’s got OpenGraph and TwitterCards and stuff in it, so it looks fancy. I’m glad to get people from the FOSS universe represented in the eventual data set, but tell people everywhere. I’d be second-degree polynomially grateful if you shared the word. If not, absolutely no hard feelings.

Till next time; stay spaced out there.

N

(P.S. Comments are off on the actual blog site because some WordPress plugin in there is junk and mangles the CSS. ETA 2024 before I fix that. Or sooner if I nuke it & repave. Questions or comments, hit me on Mastodon, etc.)

Engaging with the OSI Elections 2022.1

Every year, the Open Source Initiative (OSI) holds an election to add a few new directors to its board of directors. This year, I decided to try and engage with that process, asking the candidates some real questions.

This is an account of that, which I’m hoping is food for thought for any and all other people who care about FOSS governance, and it’s also a record of what happened, since I expect that to be soon forgotten if I don’t write it down myself. It’s been more than a month, yeah, so … ancient history. I had a thing. Still, historical record and all.

Background

Ostensibly, these elections are serious affairs. The OSI is a high-profile organization, with a robust list of Big Tech sponsor companies funding it. And “open source” as a term is the OSI’s property: the OSI is in charge of the trademark and defends it when it is misused; the OSI also maintains the formal “open source” definition and the list of licenses that you are permitted to call “open source”. That’s actually not a massive list of duties, so you might well wonder why the OSI is so high-profile to begin with; I think that’s one of the big philosophical questions, certainly worth revisiting (especially with regard to running for governance positions) but, for the time being, suffice it to say that it’s where we (“we” meaning “FOSS in the general sense”) are.

Nevertheless, these elections kinda just plod through without a lot of interest or engagement.  You might remember that in 2021, the election-management- or voting-system went a little haywire and the OSI had to redo the election. But, by and large, they aren’t really news when they happen. That’s in pretty stark contrast to the public back-and-forth that happens for Debian Project Leader (DPL) elections and the brouhaha over recent leadership “maneuvering” (scare quotes intentional) in the FSF.

The OSI board candidates can each write a candidacy-page text that gets put on the wiki, but it can say whatever they want. In short, to you the voter, there’s no genuine back-and-forth provided. No debates, no time allotted, no required position papers, etc. For the past few years, however, Luis Villa has made an effort to pose questions to the candidates. I think that’s great. Although not everyone answers, some do. But Villa is just one voter, and he doesn’t ask everything. And some of what he does ask is more of a “tell us in your own words” prompt than it is anything particularly drilling down on a specific point. So it’s not a real incisive process on those grounds.

Plus, this year I saw some things that concerned me about the ballot, and when I got the OSI’s “this much time until the election” email, I realized it was now or never. So this year was my first attempt to ask the candidates questions that I, as a vote-caster, wanted to know the answers to.

The ballot and the candidates

OSI membership itself is a little odd; I’m a student member, but I had to jump through a lot of hoops to even figure out how to join as a student member in the first place, and if I hadn’t been old-conference-pals with the then-OSI executive director I’m not sure I would have succeeded. Not a lot of info on the site; lotta dead links, etc. And even then, the process is basically trade-some-personal-email-based. That could use updating as well, but I digress.

The relevant point is that my “student” membership is a subclass of the “individual” membership. Individual members are people who have joined OSI of their own, personal accord. There is one other type of membership: “affiliate” membership, which is open to organizations (specifically, to other non-profit orgs, FOSS projects, user groups, and educational institutions). Affiliate members are institutions, but they are not the same as OSI corporate sponsors.

That matters because the OSI board is also a little odd, with four of the seats chosen by the individual members and four chosen by the affiliate members, plus two more seats chosen by the other members of the board itself. Sponsors don’t get any seats. Add to that the fact that the individual-seat board members and the board-selected board members have a two-year term while the affiliate-seat board members have a three-year term.

Whatever. The relevant idea is that individuals and affiliates are separate memberships, and there is one election held for the individual-member board seats and a separate election held for the affiliate-member board seats. I can vote for “individual” seat candidates; I can’t vote for “affiliate” seats. The affiliate organizations get an organization-wide vote for the “affiliate” board seats; they do not get to vote on the “individual” board seats. And that’s by design.

Which brings us to this year’s candidates. Two individual seats were up for voting and two affiliate seats were up for voting. There’s an election page that currently has some of the details on it, but the pages of the candidates were only ever on the wiki (side note: don’t have two disjoint CMSes for your organization. It makes the baby pandas cry.)

[There used to be a blog post linking to all of the bio pages below but, for some reason I haven’t been able to track down an answer to, OSI pulled down that page– //blog.opensource.org/meet-osis-2022-candidates-for-board-of-directors — and it’s not archived in Wayback Machine. I’ll update that if it reappears. I locally archived all the links below before posting this, so if those also vanish, lemme know.]

The individual candidates running for the 2 individual-voter seats were:

Editorial note: at the moment, the above “candidacy page” links show the questions I eventually posed the candidates; feel free to not read those yet if you want to stay spoiler-free….

The affiliate candidates running for the 2 affiliate-voter seats were:

…and, just for comparison, the current board prior to Election Day was:

  • Aeva Black {individual} (Microsoft)
  • Megan Byrd-Sanicki {individual} – outgoing – (Google)
  • Catharina Maracke {individual} (Lawyer; lists several associations with unclear professional-affiliation status)
  • Josh Simmons {individual} – outgoing – (Salesforce)

plus:

  • Thierry Carrez {affiliate} (OpenStack)
  • Pamela Chestek {affiliate} – up for reelection
  • Hong Phuc Dang {affiliate} (FOSSASIA)
  • Italo Vignoli {affiliate} – outgoing – (The Document Foundation)

and:

  • Justin Colannino {board-appointed} (Microsoft)
  • Tracy Hinds {board-appointed; serves as Treasurer}

(Side note: it’s sometimes slightly tricky to come up with a real precise “employer”/”affiliation” tag to parentheticize next to each candidate; the meanings vary between individual and affiliate seats, and there are people who are self-employed, etc. It’s a spectrum.)

So, there we have it. As a matter of principle, I may care about the overall make-up of the OSI board, but as a voter I am offered input only on the “individual” board seats. So that’s where I focused my attention.

Issues

Before you even look at the candidate pages, a couple of things stand out. First, there are two organizations who have multiple people running this year. There are two Red Hatters running on the same ballot, and there are two OpenUK-ers running, with one of them running on the individual ballot and the other on the affiliate ballot. Slightly less noticeable is that there is a Googler running to fill a seat vacated by an outgoing Googler.

It’s clearly a problem for any one organization (for profit or otherwise) to hold on to a disproportionate number of seats on the OSI board. This is single-digit territory, mind you. That being said, some of these organizations are quite large and could hypothetically have people in totally disjoint divisions running, with different experiences, reporting to different supervisors, and operating in mutual isolation from one another. We just don’t know.

Furthermore, it’s arguably most problematic for the two OpenUK-ers who are running simultaneously for a seat on two different ballots, then somewhat less problematic for the two Red Hatters both running for an individual seat, and further less problematic for the Googler running as another Googler exits. That’s because the “individual” and “affiliate” voters are different constituencies; the individual-seat voters can see two Red Hatters on the ballot and say “yeah, we definitely need more diversity than THAT”, but the double-dipping maneuver goes more easily unnoticed when one candidate is presented to the individual voter bloc and the other candidate is presented to the affiliate voter bloc. That’s compounded by the fact that the affiliate “voters” vote at the organizational level; they’re more likely to cast those votes based on some internal organizational process that we just can’t know from outside.

The second thing that stands out is that one of the candidates for an “individual” seat is actually affiliated with one of the affiliate organizations. That’s a second red flag; the affiliate seats exist to give the affiliate organizations an institutional say in OSI operations. Affiliate employees and officers should definitely not be eligible to run for an “individual” seat for that reason. If they want to participate, they need to run for one of the affiliate seats that is set out in the bylaws for that purpose. Doubling down on the inappropriateness, this affiliate is OpenUK, who was also a red-flagged participant by virtue of running two candidates at the same time, on different ballots.

Moreover, the candidate in question (Brock) is, in fact, the CEO of the affiliate organization. Which is exponentially more problematic, even without the other two prior red flags.

The other thing to notice is that several candidates are employed by corporate sponsors of OSI. This, too, cedes a disproportionate degree of influence over OSI’s actions to a single sponsor.

Last and certainly least — and I want to be generous in the way I phrase this — there are some candidates whose public profile maybe leaves a few open questions to be filled in. That’s not to say that one has to be formally associated with some Big Fancy FOSS HQ to participate in the OSI; far from it. In fact, I think it’s great that anyone can run, and I’m far more interested in the OSI hearing input from individual computer users than I am in giving yet another soapbox to a well-heeled commercial outfit that already has other ways to throw its weight around (and isn’t shy about using them). But some of the candidates’ candidacy pages are a tad sparse on detail, and it’s not immediately clear where they participate in the open-source community. So knowing more of that background information would go a long way toward establishing how well-suited and interesting they ultimately are as candidates.

So it’s a troubling ballot to look at. There’s an ostensibly non-profit organization that’s an official OSI affiliate trying to run its CEO as an individual candidate while also running a second member (a board director) on the appropriate, affiliate ballot in the same election. There are also two financial sponsors running candidates on the individual ballot, one of them (Red Hat) running two candidates at the same time for the two open seats.

If you’re an individual voter at this stage, it looks rough. To be perfectly frank, OpenUK is violating multiple ethical principles here and clearly should not be allowed to run an officer (much less CEO) as an individual candidate. That, after all, is the entire reason that OSI has set out this distinction between “individual” and “affiliate” board seats in the first place; robbing the individual voters of their representation is blatantly wrong.

Only slightly less troubling are the organizations running multiple candidates at the same time; that raises questions about collusion or coordination on the part of the organization, which obviously would be unethical.

Finally, no financial sponsor ought to be allowed to run a candidate on the individual ballot unless that candidate will publicly agree to take steps to act solely on behalf of the individual voters, rather than the sponsor (I’m not trying to be overly specific about this risk; “recusal” and “conflict of interest” policy stuff is all I mean). This is an obvious-enough avenue to trouble that the OSI has theoretically adopted a conflict-of-interest policy … with a disclosure requirement. But they have not posted any disclosure statements since 2019. Why not? Dunno.

Assessment

So where do we go from here, if we’re an individual voter concerned about the governance of OSI? Well, Brock’s candidacy is egregiously unethical and should not have been allowed in the first place. But it has been allowed for 2022, so fixing that for the future is a matter of amending the broken bylaws … in the future. On the sponsor-employee front, there is at least a conflict-of-interest policy to point towards and ask for clear answers. And on the multiple-candidates-from-one-organization front, although this also really does need to be fixed in the bylaws, we can at least raise the issue in public, and request some clear answers.

So that’s what I ultimately set out to do for 2022. But let’s save that (and an analysis of the candidacy pages, which factored into what questions I posed) for part II. This has gotten long enough already. Do feel free to contemplate what you would ask the candidates (and even to stay spoiler-free by not reading my own questions, if you’re that hardcore).

I know it’s easy to think that the little project-governance details don’t matter, but they’re important. Procedures matter; limits and checks on authority matter; equitable influence for all participants matters; honesty matters. Without an eye firmly fixed on those subjects, they can get eroded fast.

Computers and paper: BFFs forever [prologue]

To some degree, the stuff below is one of those “I’m writing this down somewhere so I don’t forget it myself” topics, of which I have [checks notes…] about fifteen draft posts currently gathering electron dust. But it’s also potentially of interest to one or two other people, so here we are.

For background, if you’ve been lucky enough to not hear about it before, I’m currently in a postgraduate research-degree program in typography. A lot of my research work day-to-day involves consuming & tracking info from printed documents. Despite what Silicon Valley’s OLED-Industrial Complex tells you, printed matter is not a low-quality substitute for digital ideals; it’s its own thing, and it comes with its own challenges when you use it as data.

And that’s a constant source of headaches. So I’m going to jot down some of the process and the challenges / open questions that I run across regularly in my research days; if I do that I might have to reinvent those wheels less often, and there’s always the chance that some other goofball out there will see it and find something helpful.

Le grand scheme

So here’s the rough picture. Ideally, I’d like to have all the source material that I collect accessible in the same place, so when the occasion arises I could sift through it in a couple of different fashions:

  1. Full-text searches – e.g., grabbing all the material that uses the term “sector kerning”, for some comparative reason
  2. Topical tagging – letting me attach my own labels to things, even if the label isn’t a word that’s found in the source doc, such as “disputed” or “real-time”
  3. Tracking usage – I ought to know if I’m actually using a quote from something in some written material. Technically that could be a tag, but it’s a little more involved.
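Those three access modes can be sketched as a toy in-memory structure, just to make the distinction concrete (the records below are invented, and real Zotero obviously does far more):

```python
# Tiny fake "library" demonstrating the three access modes.
library = [
    {"title": "On sector kerning", "text": "sector kerning was proposed...",
     "tags": {"disputed"}, "quoted_in": {"chapter-3"}},
    {"title": "Spacing survey notes", "text": "participants marked pairs...",
     "tags": {"real-time"}, "quoted_in": set()},
]

def full_text(term):  # mode 1: full-text search
    return [d for d in library if term.lower() in d["text"].lower()]

def by_tag(tag):  # mode 2: topical tagging (labels need not appear in text)
    return [d for d in library if tag in d["tags"]]

def used_in(doc):  # mode 3: where did I actually quote this?
    return [d for d in library if doc in d["quoted_in"]]
```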

For digital sources, this isn’t too terrible. Digital-original documents are pretty searchable from desktop or CLI tools (although you need a good directory structure). You definitely need to actually grab downloaded page-snapshots of web sites, though: people rearrange their personal and institutional pages ALL THE TIME; they bork up the URI structure and don’t bother to fix it, they change their domain name, etc.

You also have to actually fork (not star…) GitHub repositories and download them, because people take down that stuff, too. And you cannot rely on the Internet Archive to Wayback-machine everything. It doesn’t always work, and the crawls stop at a finite depth. Pretty famously in the type world, the Typedrawers web forum got taken offline a few years ago, erasing literally decades of industry discussion. The site owners never got around to restoring it. But even before that, they halfheartedly converted a bunch of the threads to some new CMS, which broke (a) every Wayback Machine link, (b) every internal URL in every thread, and (c) every “previous page | next page” link in every Wayback link that did exist. They still have not been brought to justice.

Anyway, I digress. That minor effort works fine for digital originals. It gets a lot harder for printed sources. This is where a real database-driven tool becomes mandatory. I’ve been using Zotero, which is fine as it goes although it has plenty of pain points. It is, at least, something that you can run entirely on your own machine (although they do try to rope you into using their hosted service, which you have to pay for if you go over the comically-small limit). And, obviously, it’s FOSS and runs on Linux machines.

The gist is, however, that you import every item into the Zotero library, and it handles all the metadata-level stuff at the top level. For each item you can store any number of files or notes. Great for having a source PDF, a translation, a set of notes, and an alternate version of the source you find months later with slightly different wording. Hooray.

A couple of things become tricky about getting the paper original into Zoteroized form. Here’s how I break it down:

  • If I have a scanned version of an article or other published document, it needs to be OCRed before the text is available for Method 1 (full-text search). Super hard.
  • If I have a LONG FORM physical book, then I can’t scan the whole thing and probably wouldn’t want to. In fact, it might not even be possible to, if it can’t be laid out flat.
  • If I have an un-published source, such as some personal archive material or correspondence, then I have to wrangle all the problems of the first bullet point but also manually do all of the metadata-like stuff, because Zotero cannot handle it at all. And it usually has to be photographed or manually transcribed like the stuff in the second bullet point.

Seems easy enough.

But there’s another dimension to look at.

The Secondth Dimension

That bullet point above really just looks at the presence of the item: do I have a scan? / do I have the physical item only? / do I not have the item at all, and have to make do with temporary access to it?

The other thing I’ve found is that you have to develop distinctly different workflows based on the potential uses of the item, after it gets into the digital system:

  • Things that you need to quote. This needs to be accurate. If it’s a digital original, great. If it’s a good scan, also reasonably doable. If it’s unscannable, that’s tricky: e.g., I need to just capture (accurately) the passage that I would / could quote, and link that as a “note” into the Zotero metadata-record for the item. That’s hitting #1 and #3 on the above numbered list in particular. Ideally I could photograph a passage and have that OCRed, rather than me re-typing it by hand.
  • Things you need to reproduce an image from.
  • Things you need to get an image from, but from which you cannot reproduce the image. Generally that’s a licensing issue. People want compensation for almost any image that gets reprinted; for quoting a document there is a really high threshold to reach before that becomes a concern.

So that’s kind of a 3×3 matrix. And I keep saying “scan” in the generic sense, but that also doesn’t quite capture all the possible issues. For example, you could download/receive-from-a-third-party a digital original that does not have text embedded, possibly making it essentially a PDF with just a TIFF image of the page in the middle. And anything that you photograph or scan yourself can have all kinds of image corrections needed, such as skew or 3D warp (for the latter, consider the curvature of a book page when you’re holding it down by the spine over a scanner). Those things can affect whether or not OCR is even possible. Actually, that can happen with material you get from elsewhere, too….

Then there’s all the problems of OCR itself: error rate, whether or not you can edit and correct errors, and so on. Plus whether or not you can OCR complex layouts, in more than one language. Luckily I’ve only had to deal with English + German + French so far, but it’s not nothing.
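Measuring that error rate is at least straightforward once you have a hand-corrected ground truth; here’s a rough stdlib-only Python sketch (the sample strings are invented, and this matched-characters ratio is only a cheap proxy for a proper edit-distance character error rate):

```python
import difflib

def char_error_rate(ocr, truth):
    """Approximate CER: share of ground-truth characters NOT matched in order."""
    matcher = difflib.SequenceMatcher(None, ocr, truth)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1 - matched / max(len(truth), 1)

# Classic OCR confusions: "h" read as "li", "w" read as "vv".
rate = char_error_rate("Tlie quick brovvn fox", "The quick brown fox")
```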

Finally, you get to the big weirdnesses that come with doing all this document processing for typography material. For example, typography articles often have lots of illustrations … of text. As in letters. And words. Even whole lines and paragraphs. Which need to be preserved as images, not converted to OCR. Or maybe you do also want them OCRed; you might remember that an illustration was of the word “NAVVIES” and want to search for that.

Those are the pieces. You get varying workflows out of each of those combos, multiplied (possibly) by some of those later quirks. Anyway, more on that to come in the next post; like I said above this is mostly me jotting down things to record the workflows, but I’m trying to slice it into thinner chunks because otherwise I never finish writing the posts.

If you’re curious about any of the details, they’ll be in a follow-up. FORTUNATELY, if you’re a normal person and not, therefore, interested in the details, then at least now you know what I’m titling this little series of posts — so you can skip the next one with a maximum of comfort and ease.

Emojent behavior

Earlier today I saw a social-media post saying, essentially, “Microsoft ought to release its new emoji font as FOSS!” with the addendum that doing so would “give some competition to Noto,” which the post-writer claimed to love. Doesn’t matter who wrote it. You do see people ostensibly in the FOSS community say stuff like that on a fairly frequent basis.

For starters, though, begging for a proprietary software vendor to re-license its product under FOSS terms is, at best, a wild misinterpretation of Why Vendors Do What They Do. Microsoft doesn’t re-license products on a whim, or even because they’re asked nicely, and they don’t decide to make something open source / free software accidentally. When they do it, it’s because a lot of internal offices have debated it and weighed the options and all that other corporate-process stuff. I think that’s fairly well-understood, so let’s skip to the font-specific parts.

That original post elicits eye-rolls in part because it undervalues fonts and emoji, as if the ONLY way that end users are going to get something of quality is if a “better” (read: proprietary) project makes it for them and then takes pity and releases it into the wild. It also elicits some eye-rolls because it smacks of “ragequit Google products”, although naturally it’s hard to know if that’s really happening behind the scenes or not. I’m pretty active on Mastodon, and one of the peculiarities of the “fediversal” mindset is that there are a lot of folks with a knee-jerk reaction of hating on any software project deemed too-close-for-comfort with one of the Big Suspicious Vendors. It can be hard to adequately extract & reframe material from that pervasive context. So who can say; maybe there’s none of that.

Un-regardless, the bit in the original post that’s most obviously and demonstrably off-base is the suggestion that Noto is in want of competition on the “FOSS emoji” front in the first place. I can think of four other FOSS-emoji-font projects off the top of my head.

But it got me thinking: I wonder how many such projects there are, in total, since I’m certain I’m not up-to-date on that info. A couple of years ago, I made a list, so I at least had a start, but I decided to take a few minutes to catalog them just for procrastination’s sake. Consider this a spiritual sequel/spin-off of the earlier “how many font licenses are there” post. Here’s a rough approximation, loosely grouped by size & relationship:

  1. Noto Emoji (Color) [src] — the obvious one, referred to above.
  2. Noto Emoji B&W [same repo, different build] — which you might not be as familiar with. This is an archived black-and-white branch (think “IRC and terminal”) which is still available. Interested parties could pick it back up, since it’s FOSS.
  3. Blobmoji [src] — this is another fork of Noto, in color, but which preserves the now-dropped “blob” style of smiley/person. [Side note: most emoji fonts these days are color-only; I’ll point out when they’re not. Just flagging the transition here.]
  4. Twemoji [src] — This is the other giant, corporate-funded project (developed by Twitter) which everyone ought to be familiar with.
  5. EmojiTwo [src] — This is perhaps the biggest and most active of the not-part-of-another-project projects. It’s a fork of the older EmojiOne [src] font, which in classic fashion used to be FOSS, then got taken proprietary as of its 3.0 release.
  6. EmojiOne Legacy [src] — This is the last available FOSS(ish; depending on who you ask) version of EmojiOne, said to be c. version 1.5.4. As the name implies, not being developed. If you take a liking to it, clone the repo because it could go away.
  7. EmojiOne 2.3 / Adobe [src] — This is another rescue-fork (I think we need a better word for that; hit me up) created by Adobe Fonts, around EmojiOne 2.3.
  8. FxEmojis [src] — This is a no-longer-developed font by Mozilla, originally part of the FirefoxOS project. I tested a FirefoxOS phone back in the day. It was a little ahead of its time; perhaps the emoji were as well…?
  9. Adobe Source Emoji [src] — This is a black-and-white emoji font also by Adobe Fonts, originally designed for use in Unicode Consortium documents. Does not seem to be actively updated anymore, however.
  10. Openmoji [src] — This is a pure-FOSS project on its own, which includes both color and black-and-white branches.
  11. Symbola [src] — This is an older emoji font that predates a lot of more formalized FOSS-font-licensing norms. But it is still there.
  12. GNU Unifont [src] — Last but not quite least, Unifont is not a traditional font at all, but a fallback pan-Unicode BMP font in dual-width. It does, however, technically include emoji, which is quite an undertaking.
  13. Emojidex [src] — Last and certainly least is Emojidex, a fork-by-the-same-author of an older emoji font project named Phantom Open Emoji. Both the older project (despite its name) and the new one have a hard-to-parse, not-really-free, singleton license that I suspect is unredistributable and likely self-contradictory. But it seems like the license quirks are probably more to be chalked up to being assembled by a non-lawyer and not reviewed, rather than being intentionally hard on compatibility. So who knows. If you get curious, maybe it’d be an easy sell to persuade the author to re-evaluate.

I’m sure there are more. And that’s not even getting into other symbol-font projects (which are super popular, especially among chat & social app developers for flair and reaction stickers); the list above sticks to Raw Unicode Code Point stuff.

Making an emoji font is a LOT of hard work. Maintaining one is a LOT of hard work, too. The visual assets take a tremendous amount of time to design, hone for consistency, and test; the engineering and font-building process is extremely difficult (it involves different font-compilation toolchains and different source-file editors than other fonts, not to mention that there are multiple binary formats and the files themselves are utterly massive compared to other font binaries).
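To give a sense of the “multiple binary formats” problem: color emoji glyphs can ship in at least four different OpenType table flavors, and you can tell which one a font uses just by looking at its table tags. A minimal sketch follows; the tag-set literals are illustrative, since in real use you’d pull the tags from an actual font file (e.g. with fontTools, `set(TTFont(path).keys())`):

```python
# Map OpenType table tags to the color-emoji format they indicate.
# (Note: the OpenType-SVG tag is literally "SVG " with a trailing space.)
COLOR_FORMATS = {
    "CBDT": "Google embedded bitmaps (CBDT/CBLC)",
    "sbix": "Apple embedded bitmaps (sbix)",
    "COLR": "Microsoft layered vectors (COLR/CPAL)",
    "SVG ": "OpenType-SVG (Adobe/Mozilla)",
}

def color_formats(table_tags):
    """Return the color-emoji formats present in a set of table tags."""
    return sorted(COLOR_FORMATS[t] for t in COLOR_FORMATS if t in table_tags)

# A CBDT/CBLC font (the format Noto Color Emoji uses) vs. a plain B&W font:
print(color_formats({"cmap", "glyf", "CBDT", "CBLC"}))
# -> ['Google embedded bitmaps (CBDT/CBLC)']
print(color_formats({"cmap", "glyf"}))
# -> []
```

The fact that a renderer has to support whichever of these tables a given font ships is a big part of why emoji-font engineering is so much harder than regular-font engineering.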

Most of the fonts above are not packaged for Debian/Ubuntu nor, I’d be willing to wager, for many other distributions. So there’s a big, unmissable target for a project there. The Noto Color Emoji font is packaged because, well, it builds, thanks to the toolchain the team maintains. Want to find one from a different source and revive it or freshen & update it?

All of the above projects are under-staffed. So if you actually care about FOSS emoji fonts, they’re where you should start contributing.

Happier users through fewer fonts

Frequent readers of my blog might (in between existential crises) recall that I’ve been devoting a lot of thought lately to how we can improve the font-selection process on desktop Linux systems. There are, for instance, lots of spots where the plumbing layer short-changes the user experience we can present to a GNOME user as compared to a macOS or Windows user: we frequently don’t include all the necessary metadata in distro-packaged fonts, and as a result the contemporary generation of font managers — none of which are installed by default — don’t have much useful information to present that would let somebody decide that yes, in fact, they do want to install font X. The general system-package manager application has even less to show. Fair enough; we can work out what needs to be improved, open issues against the software projects and the font packages, and incrementally improve matters.
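Worth noting: most of the metadata a font manager would want to surface (family name, designer, license) already lives in a font’s OpenType ‘name’ table, keyed by standard name IDs; the gap is that it doesn’t always survive into what the package tooling shows. A sketch of that mapping, with made-up record values (in real use you’d read the records from a font file via fontTools, e.g. `TTFont(path)["name"]`):

```python
# Standard OpenType 'name' table IDs for a few manager-relevant fields.
NAME_IDS = {0: "copyright", 1: "family", 9: "designer", 13: "license"}

def metadata(name_records):
    """Map raw name-table records {nameID: string} to friendly keys."""
    return {label: name_records[i] for i, label in NAME_IDS.items() if i in name_records}

# Hypothetical records for an imaginary font:
print(metadata({1: "Example Sans", 9: "A. Designer", 13: "SIL OFL 1.1"}))
# -> {'family': 'Example Sans', 'designer': 'A. Designer', 'license': 'SIL OFL 1.1'}
```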

Fast forward. A couple of weeks ago, I sat in on the GNOME Metered-Data / Parental Guidance Hackfest at Red Hat in London. And I do mean sat in, because I contributed virtually nothing. (But it was a good experience; I’ve never lived close enough to a hackfest site before, so I was really wanting to see the whole feature-idea-to-implementation process, start to finish, up close. Very educational.)

ANYway, along the way I got into a handful of side conversations about fonts and script support and, in one of those, Richard Hughes pointed out how complicated font packages make matters for newer distro initiatives, like the Flatpak and Snap application formats and Fedora’s Silverblue. Because fonts want to be installed in protected system directories (at least when they come from the distribution) like /usr/share/, they punch holes in the security models that Flatpak, Snap, and Silverblue are in part driven by.

Tangent: Silverblue is one of those “immutable system image” distribution ideas, wherein the OS image is read-only and the user is generally meant to modify it only by installing per-user packages like Flatpaks, which are overlays that exist only in tightly defined sandboxy containers. The OS updates are transactional; you have to reboot the system to swap over to the new immutable OS filesystem. Ubuntu Core is the same model, give or take, and there are others (or I’m pretty sure there are; Ubuntu Core I know about because it’s been around a while, there used to be another Fedora incarnation called Atomic, and I think the idea has been tried in other places I’m simply less familiar with).

Trouble is, nobody wants to reboot their machine just to install a font. And as it turns out, you can’t just package a font in a Flatpak, either, because the sandbox keeps other sandboxed apps (i.e., the applications you might actually want to use the font in) from being able to see it. Snap gives you pretty much the same set of constraints. And, for some reason, the OS developers seem down on the idea that there would be hundreds of overlay filesystems installed, on a one-overlay-per-font basis.

Richard’s first thought was how we could streamline what’s included in a font package, but quickly enough we got to the question “what if the distributions stopped packaging fonts altogether?” Or, at least, what if the distros only worried about a small set of basic, language-support fonts in the core OS, and repurposed other fonts as user-level packages. Most desktop Linux boxes, we’re told, are single-user systems anyway … so if the font packages all installed to ~/.local/share/ then most users would never notice the difference.
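For that single-user scenario, a per-user “install” really is just a file copy into the XDG data directory, which fontconfig already scans by default. A minimal sketch of what such a user-level package’s install step might look like (the function names here are mine, not from any real tool):

```python
import os
import shutil

def user_font_dir():
    """$XDG_DATA_HOME/fonts, defaulting to ~/.local/share/fonts."""
    data_home = os.environ.get("XDG_DATA_HOME") or os.path.expanduser("~/.local/share")
    return os.path.join(data_home, "fonts")

def install_font(src_path, dest_dir=None):
    """Copy one font file into the per-user fonts directory."""
    dest_dir = dest_dir or user_font_dir()
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(src_path))
    shutil.copy2(src_path, dest)
    return dest
```

Running `fc-cache` afterwards is the conventional way to refresh fontconfig’s caches, though recent fontconfig versions will usually notice the new file on their own.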

It’s interesting to think about. Font managers for the desktop would only need to worry about one directory instead of two, and font packages themselves would be a lot more cross-distro-compatible. It would, however, mean altering all of those font packages, and that would be a ton of work. I’m not really clear on whether it’s even possible for Debian packages or RPMs to install stuff into the user’s home directory anyway (legally speaking, by distro policy). If not, then you’d suddenly be talking about a different package format altogether.

And I get the impression that the system-engineering folks out there would be looking to offload the maintenance of the font packages themselves to a third party — but there really isn’t one that can take up the slack.

But consider the ancillary gains, too: gone would be the problem of “recommended font” sets (an untenable idea that keeps coming up); instead users in the know could devise and publish their own recommendations just like The Kids Today do with their Spotify playlists. If per-user fonts were the norm, distributions could phase out fonts that need superseding with a lot less pushback. And it would (at least in theory) be drastically easier to synchronize the set of installed fonts between two desktop machines running different distributions. It might even be possible to persuade more upstream font projects to maintain the “packages” themselves, reducing lag time.
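That synchronization idea, at its simplest, is just a set difference over the two machines’ per-user font directories. A toy sketch, with hypothetical file names:

```python
def sync_plan(fonts_a, fonts_b):
    """Given two machines' font-file manifests, return what each is missing:
    (files to copy to A, files to copy to B), each sorted."""
    a, b = set(fonts_a), set(fonts_b)
    return sorted(b - a), sorted(a - b)

to_a, to_b = sync_plan(
    ["NotoColorEmoji.ttf", "OpenMoji-Color.ttf"],
    ["NotoColorEmoji.ttf", "Twemoji.ttf"],
)
print(to_a)  # -> ['Twemoji.ttf']
print(to_b)  # -> ['OpenMoji-Color.ttf']
```

A real tool would of course want to compare versions and checksums, not just names, but the point is that per-user directories make the whole problem a plain file-level one.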

This is not likely to happen. But it’s worth chasing down how it would work, I think, because it leads to some intriguing side effects. But enough from me: thoughts?
