Monday, July 21, 2014

Testing, testing. Can u hear me?

A friend posted recently about abbreviations of the word microphone. Apparently, abbreviating microphone to mic gives him a nervous tic. For him, mike is the proper abbreviation. It is not clear if he is in the pronunciation anti-mic camp--their argument is that mic is liable to be mispronounced as "mick" (like "tic")--or if his objection is solely because (his point) we abbreviate "bicycle" as "bike" and, therefore, should abbreviate "microphone" to "mike." Except the first "c" in "bicycle" is soft and the "c" in "microphone" is hard, making this a specious comparison. If bike is using the sound of the second "c" (and it is), then the type of abbreviation must be completely different. Those who take issue with the spelling because of potential mispronunciation make a good point: mic out of context likely would be mispronounced. But that is true of many words in English.

The English language is a harsh mistress. English is built upon and borrows liberally from many languages and cultures and, thus, has many inconsistent forms. There are precious few rules that apply to all of her subjects sweepingly. This is quite evident in our abbreviations.

Mike or Mic?

In fact, the common abbreviation for microphone in audio and engineering is mic. This shorthand of mic for microphone has been used by professional broadcasters and musicians and in equipment labeling for many years. Mic was used in printed texts (remember those?) at least as early as 1961. (It did not start with rap, as has been stated by a few commentators.) Interestingly, the verb form, to express the act of setting up a microphone, seems to be written somewhat interchangeably as both "to mic" and "to mike." Regardless of which spelling is used for the abbreviation as a verb, "mike" is used for its past tense, present third-person singular, and passive voice forms.

Stoddard mikes himself before he goes on air.
The engineer miked him already.
Streisand always is miked.

Who's your daddy?

Not only does English itself have a plethora of standard abbreviations, every field has its own set (and sometimes many sets) of abbreviations. Neither the comparative abbreviations nor comparative pronunciations approach is particularly useful in English to determine how abbreviations should be formed. There are some basic forms for abbreviating, but first, here are a few other interesting abbreviations and comparisons of abbreviations in English.

mother > mom, mommy
but
father > dad, daddy
number > num
but
amount > amt
and
quantity > qty

By the way, if something is countable, use quantity or number; if not, use amount.

telephone > phone
but
television > T.V. (sometimes teevee), now TV
satellite > Sat
but
Internet > Net
World Wide Web > WWW > Web > web
I owe you > IOU

Did you think the shortcut of abbreviating "you" to "u" started with text messaging (texting), electronic mail (E-mail > Email > email), or social media (SM or sm, not to be confused with S&M, although SM can be both torturous and addictive)? Think again. The IOU abbreviation using U for you has been around since the late 18th century.

Types of Abbreviations or Abbreviations of Type?

Dropping the end of a word to abbreviate it (deli, gym, mic) is called clipping. Dropping the beginning, as in telephone to phone, is called apheresis. Dropping letters from the middle (mgmt, fwd) is called contraction. The first two (and, arguably, contracting) are syllabic-based methods of abbreviating words.

Television to TV, International Business Machines to IBM, self-contained underwater breathing apparatus to S.C.U.B.A. (then SCUBA, now scuba), and World Wide Web to WWW are letter-based methods of abbreviating, which typically, but not always, abbreviate using the first letter of each significant word or syllable. Examples of exceptions are extensible markup language, which abbreviates to XML, and user experience, which abbreviates to UX. But, I beg you, for all that is holy, please do not write these out using a capital X! If I see one more eX- anything, I may run screaming for the eXit.

Letter-based abbreviations that form a pronounceable word, such as scuba, NATO, and radar, are acronyms. Radar actually is a hybrid letter- and syllable-based abbreviation for "radio detection and ranging," forced somewhat in order to make a memorable acronym. Letter-based abbreviations that are not pronounceable as a word (IBM, CIA, WWW) are initialisms rather than acronyms. Note that this distinction is often missed as a result of massive misuse. I suspect that this conflation began with texts that show all abbreviations in a list of "acronyms" rather than in a properly named list of "abbreviations and acronyms" or, simply, "abbreviations." (All acronyms are abbreviations; not all abbreviations are acronyms.) Nonetheless, the distinction remains.

Some shortened words and names may not much resemble their longer forms, such as father shortened to dad or daddy. These particular abbreviations also happen to be hypocorisms. Hypocorisms are words that are for or about children or endearing 'pet' names. Daddy, like many hypocorisms, also adds a softening, singsongy -y sound at the end.

The mutability of English also is apparent in abbreviations. Dropping the periods in abbreviations is common in technology and communications, and it is becoming more common in the mainstream and academia. We now write PhD, NATO, and USA. But keep the periods in U.S. (US is a magazine, not a country.) Similarly, abbreviations, particularly acronyms, that do not represent a proper noun generally now are written in lowercase instead of uppercase letters. Hence, we have radar, scuba, laser, and pin instead of their unnecessarily bulky predecessors. Likewise, proper-name acronyms are moving toward only an initial capital (Unicef, Peta, and Fema).

And about those text shortcuts? That's TMI 4 2nite. BB4N.

Thursday, July 17, 2014

Phone Home?

A Wall Street Journal post highlighting the results of a Pew Research Center religious attitudes poll is making the rounds on social media today. The Pew Research poll results were released and published on their own site yesterday under the heading, "How Americans Feel About Religious Groups: Jews, Catholics & Evangelicals Rated Warmly, Atheists and Muslims More Coldly." Although I claim no particular insight into the veracity of their data or claims, I find frustrating a few things about this survey. Overall, both the WSJ snippet and Pew's own highlights remove so much context and methodology from the survey that it is difficult to get a feel for the significance of the results. Pew's methodology of differently categorizing respondents' professed religions and query religions seems awkward. Furthermore, omitting significant categories (according to their own categorization and numbers) certainly must skew the poll results.

Omitting significant categories results in questions a bit like: "When did you stop kicking your dog?" There is no good answer. The Pew report inquired about the following categories of religions, which I list here (and annotate) by group-size as a percentage of all those surveyed. Lest we compare apples and oranges, the numbers affiliated with a given category are also self-reported numbers according to Pew, available on their website.

Faith (% of adult adherents)
Evangelical (26.3)
[Roman] Catholic (23.9)
Mormon (1.7)
Jewish (1.7)
Atheist (1.6)
Buddhist (0.7)
Muslim (0.6)
Hindu (0.4)

But if we are doing comparisons, why these? Why, for example, are what Pew refers to as the "Mainline" Protestants (18.1%) not included? Or Orthodox Christians (0.6%)? Or Protestants affiliated with "Historically black churches" (6.9%)? For that matter, why are Evangelicals singled out among Protestants? Evangelicalism is only one of the three groups into which Pew subcategorizes Protestants (51.3%), which, according to Pew, comprise Mainline, Evangelical, and Historically black churches. Looking more closely into the full survey results posted shows that Pew used "White evangelical," "White mainline," and "Black Protestant" categories to group respondents' declared affinities but did not use these same categories in the queries. In most surveys, the respondent categories and query categories are, indeed, different. When gathering data about religions according to religion, it would seem that having the same level of qualification would be appropriate. As one small example, I am an Orthodox Christian. If you are not immediately familiar with Orthodoxy, sometimes referred to as "the best kept secret in Christendom," think: "My Big Fat Greek Wedding." Were I surveyed, how could I answer? I would question every query: Does the query include Orthodoxy as part of Roman Catholicism (which it is not, but with which it shares much), include it as one of the Protestant sects (which it is not, but with which it sometimes is associated by folks familiar only with the break from Roman Catholicism of the Protestant Reformation and not the earlier Great Schism, which split Christianity into Roman Catholicism and Orthodoxy), or ignore it altogether? My answers would be affected by not knowing what the questions intended. Perhaps my incessant querying of queries may be why I have never been surveyed by Pew. Or perhaps it is because these types of surveys repeatedly use the same limited population.

The Pew survey was based on a population of 3217 respondents in its "American Trends Panel" (ATP). Pew recruited ATP members based on a 2014 telephone survey of 10,000 "nationally representative" respondents. Pew points out in their materials that utilizing the ATP approach allows them to better track trends and changing views over time and predict behavior. They mean, of course, that they may be able to predict the behavior of the ATP members and that, with any luck, that behavior is generalizable to the larger population. Perhaps Pew has never surveyed me because (like many of my colleagues in technology and many young, single people) I have not owned a landline phone in a decade.

For comparison, the American Religious Identification Survey (ARIS) is a survey of religious adherents in the U.S. referenced by the U.S. Census Bureau. The 2008 ARIS included 54,461 respondents, and both the categories and the numbers differ from the Pew survey. One would assume differences over six years, of course--changes influenced by such factors as population increase, changing attitudes about religion, differences in generational religious affiliation, changing immigration patterns, and so on. While calculating percentages and comparing the two surveys, one huge discrepancy struck me. In 2008 (ARIS), only 0.94% of the population declared itself Evangelical Christian. According to Pew, the percentage is now 26.3. Did Evangelicalism really grow nearly 28-fold (an increase of roughly 2,700%) in six years? Or are these kinds of surveys simply inherently flawed?

It is not merely the small population or the apparently mismatched categories that makes me question these findings. A very small population can produce great survey (or usability) results if that population truly is representative for the purpose, and most data findings have outliers. Pew claims that the ATP is nationally representative. However, based on my comparison with U.S. Census, CIA Factbook, and Gallup Poll data of their population demographics, the ATP appears to be un-"representative" of the U.S. population in almost every regard. With a potential population of almost 3000 respondents[1], it seems that Pew could shape any given ATP survey "round" (Pew's term for a particular survey instantiation) to be significantly more representative for the stated survey goals than this one appears to be. When asking about Mormons, Buddhists, and Muslims, for example, perhaps it would be appropriate to actually include Mormons, Buddhists, and Muslims. (Oh, my!)

[1] Only 2941 responses are recorded for religious affiliation (or no affiliation). The only demographic that reflects Pew's stated 3217 responses is whether the respondent is male or female.

Tuesday, December 11, 2012

Pitches and Picas and Points. Oh my!

Except for a few folks who concern themselves with design and printing, most technical writers rarely have to think about things like picas and pitch these days. And few even have much say over point size. And that is a good thing--mostly. Modern word processors use predetermined styles to lay out everything. With technical publishing systems, design and layout are set up in templates. And in structured authoring, content creation is completely separate from output. But in all of these cases, someone is making determinations about stylistic considerations such as line length, even if those determinations generally are built into defaults and are invisible to content authors. Occasionally, it is necessary to jump into a "wayback" machine to figure out what's going on.

I had not thought about line length for quite some time, but the topic came up today when I was asked about line wrap behavior for code blocks shown in documents. A "line" in code may be much longer than what can be displayed across one line on a page. Even when output is online only, it may be undesirable to show a long code line in a single line on the screen that requires the reader to scroll to the end. In this particular case, we decided that we would set a maximum at which the system will send a message notifying the content author that a code line will wrap in the output so that the content author can determine how to handle it. But in order to set up this threshold message, the tools developer needed to know how many characters would fit on a line in output.

In this case, it was a fairly straightforward calculation (once I was in "wayback" thinking mode). First, because most of the customer output is to PDFs, which have a set size and "image area," the maximum line width is known. Second, because we mark all blocks of code with tags, which allows separate control of code output, and use a monospaced typeface for the output, the width of each character in the code is known. This is where pitch and points come in. (OK, I confess, pica really has little to nothing to do with this, but it made my title alliterative and gave it the meter I wanted.)

What is the relationship between pitch and points?

Points (or point size) comes from typesetting. A point equals 1/72 of an inch. The point size of a typeface measures the height of its characters. (Points is also used to express the amount of space between lines--the leading [led'-ing].) Pica, by the way, is used in typesetting but rarely in computer typography. A pica is 12 points (1/6 inch).

Pitch in typesetting is shorthand for per inch. The pitch of a typeface is the width of its characters, expressed as the number of characters that fit into an inch. These calculations are based on fixed-width, monospace typefaces. Although the calculations work in general for proportional typefaces, keep in mind that characters are different widths in proportional typefaces; an m (for example) is much wider than an i. Taken another way, a line of 20 m's in a proportional typeface is longer than a line of 20 i's.
mmmmmmmmmmmmmmmmmmmm
iiiiiiiiiiiiiiiiiiii
In a monospace typeface, all characters take up the same amount of horizontal space, so the calculation works regardless of the content, i's or m's or other characters.
mmmmmmmmmmmmmmmmmmmm
iiiiiiiiiiiiiiiiiiii
For most standard fixed-width typefaces, you can calculate the pitch if you know the point size. 

120 / points = pitch
The most commonly-used typeface sizes, 12 and 10 points, are easy: 12pt = 10 pitch and 10pt = 12 pitch.

You also need to know how much space is available.

How many characters will fit on a line?

If you know the pitch of the (monospaced) typeface and the maximum width of that output (the image area), you can calculate the maximum characters per line (CPL) using this formula.

pitch * image area inches = CPL
In my example today, we use 8 point output for code blocks and we have a 6.5-inch image area. 

120 / 8 = 15
15 * 6.5 = 97.5
Thus, the maximum number of characters before a code line wraps is 97. The tools developer can set the message threshold to 97 to kick off a message to inform a content author that a code line will autowrap.
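For the tools developer, the two formulas above can be sketched in a few lines of Python. This is only an illustration, not the code our tools actually use: the function names are made up for the example, and it assumes a standard fixed-width typeface for which the 120 / points rule holds.

```python
import math


def pitch_from_points(point_size: float) -> float:
    """Characters per inch, using the rule of thumb 120 / points = pitch."""
    if point_size <= 0:
        raise ValueError("point size must be positive")
    return 120 / point_size


def max_chars_per_line(point_size: float, image_area_inches: float) -> int:
    """pitch * image area inches = CPL, rounded down to whole characters."""
    return math.floor(pitch_from_points(point_size) * image_area_inches)


def lines_that_wrap(code: str, point_size: float, image_area_inches: float) -> list[int]:
    """Return the 1-based numbers of code lines longer than the CPL threshold."""
    limit = max_chars_per_line(point_size, image_area_inches)
    return [
        number
        for number, line in enumerate(code.splitlines(), start=1)
        if len(line) > limit
    ]


print(pitch_from_points(8))        # 15.0
print(max_chars_per_line(8, 6.5))  # 97
```

Rounding down rather than to the nearest whole number matters here: half a character does not fit, so 97.5 becomes 97. A real implementation would also need to decide how to count tabs, which this sketch treats as single characters.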

Caveat: This is a simplistic presentation of points and pitch that does not consider things such as kerning adjustments and non-standard typefaces. Nonetheless, for many issues in computer typography such as the one presented here, these simple calculations are sufficient.

Friday, June 22, 2012

Balancing Act

It is a straightforward equation: Ein = Eout
Energy output must equal energy input in order to maintain balance. Energy input (Ein), which in nutrition is represented in Calories (1 food Calorie = 1 kilocalorie), is the amount of one's "fuel" intake. A kilocalorie is the amount of energy required to raise the temperature of a liter of water 1°C at sea level.
Energy output (Eout) is the total energy expenditure, which is a combination of a person's basal metabolic rate (BMR), the amount of energy used for physical activity, and the thermic effect of food. While the only Eout factor one can reasonably affect is one's level of physical activity, it is the BMR that represents the lion's share of Eout. BMR accounts for 60% - 70% of energy expended. The thermic effect of food (what you burn by eating and digesting food) is nominal, at about 10%.

Behind the curve or curve the behind?

Ahem. And why are women behind the curve in these weighty matters? It largely rests in the BMR calculations.
BMR (males) = (W × 10) + (W × 2)
BMR (females) = (W × 10) + (W × 1)
    where W = weight in pounds
An average, 170-pound man burns more than 2000 Calories each day without added exercise.
170-pound male, BMR = 1,700 + (2 × 170) = 2040 
A 170-pound woman burns fewer than 1900 Calories each day without added exercise.
170-pound female, BMR = 1,700 + (1 × 170) = 1870 
Men's bodies have a 9% higher BMR. The fact that men also tend to be denser (speaking strictly in terms of body mass here) and larger than women compounds the problem. So let's look instead at a 120-pound woman.
For a 120-pound woman, BMR = 1,200 + 120 = 1320. Therefore, an average, 120-pound woman can consume only 1,320 Calories without gaining weight.
So women are jilted on BMR.
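The arithmetic above is easy to check with a short Python sketch. The function name and the "male"/"female" labels are just illustrative choices for this example; the formula is the simplified one given above, with weight in pounds.

```python
def bmr(weight_lb: float, sex: str) -> float:
    """Simplified BMR estimate: (W x 10) + (W x 2) for males, (W x 10) + (W x 1) for females."""
    multipliers = {"male": 2, "female": 1}
    if sex not in multipliers:
        raise ValueError("sex must be 'male' or 'female'")
    return weight_lb * 10 + weight_lb * multipliers[sex]


print(bmr(170, "male"))    # 2040
print(bmr(170, "female"))  # 1870
print(bmr(120, "female"))  # 1320

# Relative advantage for a man at the same weight, as a whole percentage.
print(round((bmr(170, "male") / bmr(170, "female") - 1) * 100))  # 9
```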
What other factors affect BMR? First, BMR lessens as we age, and it lessens as muscle mass decreases (and increases as muscle mass increases), which generally occurs as we age. Again, men have an advantage here, as men tend to have larger muscle mass. Then there are some other factors that stack the deck. As if they need it, people who are tall and thin tend to have BMRs that are higher than the average. (Randy Newman may have been right in decrying the plight of "short people.") Stress eating? That is a double whammy, as while more Calories are consumed, mental or emotional stress simultaneously lowers BMR. Fad or temporary diets? They work for a while, then have the opposite effect, as lower food intake and fasting decrease BMR.

"Let's Get Physical"

Although it may seem like the best way to control weight, exercise and physical activity contribute only 20 to 30 percent to Eout. Furthermore, like BMR, the net effect of physical activity hinges on other factors. Age, sex, height, and weight all figure into how many Calories are burned. A younger, taller, heavier man burns more Calories than a shorter, older, lighter woman. The good news here is that increasing the frequency of activity can have a positive effect. That is, the more active one is, the more Calories one burns for the same amount of activity. Going from inactive to active can increase the daily Calories burned by 21 percent; from moderately active to very active adds another 15 percent.

What is average weight?

There's no such thing. Realistic averages depend on many factors, including age and sex. However, the mean (arithmetic average) weight and, to a lesser degree, the mean height of all women and men are increasing. In the U.S., mean height of adult males increased over 4 decades from 5'8" (1960) to 5'9½" (2002), and mean weight increased from 166 to 191 pounds. For the same time period, female mean height increased from 5'3" to 5'4", and weight rose from 140 to 164 pounds. For women in my age category (I'm not telling, but you can figure it out from the sources), the mean weight in 2002 was more than 169 pounds, a 23-pound (15.5%) increase from 1960. The smaller overall percentage increase in women's weight gain should not be construed as weight gain slowing. Women's cumulative mean weight in all categories leapt more than 6 percent since the 1984-94 survey. Men's increase in the same period was significantly lower, at 4 percent. This surely is not a good trend for either.

Overweight?

All that good news about men's advantage in weight maintenance? Men in the same age category were 16.5% heavier than in 1960. And while my category topped the charts with a mean body mass index (BMI) of 29.2, representing a 10.6% increase over 4 decades, the same age category of men weighed in at 28.7 BMI, a 12.1% increase. Body mass index is a measure of weight with regard to height. A normal BMI is between 18.5 and 24.9. The mean BMIs of 28.7 and 29.2 are considered "overweight," with over 30 being considered obese and over 40, morbidly obese. Because it factors only height and weight, and not lean muscle mass versus fat mass, BMI is not a good indicator for elite athletes, bodybuilders, or children. As I am not currently running daily due to an injury (from a car crash, not from running) and, thus, cannot consider myself an "athlete," BMI is a decent indicator for me. (Again, I'm not telling.) My BMI is on the high side of "normal," which is higher than I would prefer it to be.
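For anyone who wants to run their own numbers, here is a rough Python sketch of the BMI ranges described above. The formula is not spelled out in this post; the sketch assumes the standard definition in U.S. units (703 × pounds / inches squared). The function names, the "underweight" label, and the exact handling of the boundary values are my own assumptions for illustration.

```python
def bmi(weight_lb: float, height_in: float) -> float:
    """Body mass index in U.S. units: 703 * pounds / inches squared."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    return 703 * weight_lb / height_in ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value onto the ranges quoted above (boundary handling is assumed)."""
    if value < 18.5:
        return "underweight"  # label not used in the post; assumed
    if value < 25:
        return "normal"
    if value <= 30:
        return "overweight"
    if value <= 40:
        return "obese"
    return "morbidly obese"


# Example: a 5'4" (64-inch) person weighing 164 pounds.
example = bmi(164, 64)
print(round(example, 1))      # 28.1
print(bmi_category(example))  # overweight
```

Note that a BMI computed from a mean height and a mean weight is not the same thing as the mean BMI of a population, so this example is not meant to reproduce the survey figures.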

And?

For me, reducing Ein simply does not work. I already fast (for religious reasons) several times a year. The only way to keep my BMR up is to neither fast nor dramatically reduce my caloric intake outside of those fasts. Neither does a flurry of activity at the gym provide much benefit. The key for me is to include physical activity throughout my daily routine: take the stairs instead of the elevator, park in the farthest spot rather than the closest one, and mow my own lawn. Mowing is a big one, particularly in the heat of a Texas summer. Heat and cold extremes can increase BMR. And sleep. Never underestimate the power of a good night of sleep. Maybe when I advance to PhD candidacy that last one will be achievable.

Sources:
National Institutes of Health. Teacher's Guide: Information about Energy Balance. http://science.education.nih.gov/supplements/nih4/energy/guide/info-energy-balance.htm.
Ogden CL, Fryar CD, Carroll MD, Flegal KM. 2004. Mean Body Weight, Height, and Body Mass Index (BMI) 1960-2002: United States. Advance data from vital and health statistics; no 347. Hyattsville, Maryland: National Center for Health Statistics.

Thursday, February 3, 2011

Begin with the End in Mind

Among the speakers at the Arkansas Comprehensive Literacy Conference in October of 2007 was Dr. Sonya Whitaker. On the title page of her presentation slides, Dr. Whitaker, who currently is an educational consultant with a Chicago firm, lists herself as "Director of Literacy," in (or, possibly, for?) Schaumburg, IL.

Seasoned conference speakers know that it is important to make an immediate impact to grab the attention of the audience, and it is not uncommon to open with a relevant quote or story to engage the audience. In Whitaker's presentation, entitled The Culturally Responsive Teacher and Leader, she takes this approach. On her second slide, Whitaker quotes NASA Administrator Michael Griffin as saying:
“Every time we fly I know that we can loose a crew. That occupies a large portion of my thoughts.” [sic]
As reported throughout the world, Griffin spoke these words at the August 8, 2007 launch of the Space Shuttle Endeavour. The Endeavour flight drew significant attention from the worldwide press as the first to carry a teacher-astronaut--then 55-year-old Barbara Morgan--since Christa McAuliffe perished along with the crew in the 1986 Challenger disaster. But the point here is not Griffin’s words, but Whitaker’s representation of them. If we take this slide at face value, Griffin apparently does not know how to spell “lose.” However, she is quoting Griffin’s spoken words here: the error is Whitaker's, not Griffin’s.

As one would expect, Griffin’s sobering words were widely reported by the global media at the time of the Endeavour launch. An internet search reveals that the many reporters (or editors) who quoted Griffin knew how to spell "lose." It is ironic that Whitaker, a “director of literacy,” made this mistake at a literacy-conference presentation.

To be fair, I suspect that Dr. Whitaker actually does know the difference between "loose" and "lose" and that this is merely a typographical error that she did not correct during her own editorial process. A closer look at Whitaker's slides reveals that the presentation date was October 16, 2007 and her copyright claim is 2006. It is likely that in her haste to update an older presentation with "current" material for the October 2007 conference, she simply overlooked the error.

Unfortunately for Whitaker, her mistake was immortalized in her presentation at the conference, which also is on the University of Arkansas at Little Rock Center for Literacy website.

This error illustrates two very important points. First, even if your spelling and grammar are exceptional, it is important to proofread and edit. It is difficult to see our own mistakes. Use spelling and grammar checkers and other available proofreading and editing tools on your written work to circumvent these kinds of mistakes. It often is difficult for an author to separate intent from result; we read what we intended to write rather than what we actually wrote. Get proofreading help from someone else whenever possible. Also, it is easier to see mistakes when not actively engaged in the development side of the writing process. Proofread your work at least a day after it is finished to look for errors that were not apparent during the writing phase. Second, little mistakes can have a big impact. Spelling errors, particularly in an age where we have so many helper tools available to us, can make an author or the represented organization appear ignorant or unprofessional. An apparent spelling error at the beginning of a presentation by an educational consultant, at a literacy conference, is a great “teaching moment.” It is likely to become a topic of conversation. A corollary to this second point is that what we say lives “forever” on the internet.

My apologies to Dr. Whitaker for making an example of what likely is a typing error. However, considering her title, the venue of the error, and the significance of the quoted material, this is a perfect illustration of the need to proofread and edit. Whitaker used her opening quote to illustrate an important point in her presentation, "begin with the end in mind." She would have done well to heed her own advice.

Tuesday, November 2, 2010

Commonly Confused Words

Partly due to the rich ethnic heritage of the language, there are many words in English that are confused with other words. We have words that sound the same but have different meanings such as fair, fare, and fair.

I recently attended the State Fair [event] of Texas, where the fare [cost] to ride the Ferris wheel seemed quite fair [just or reasonable], considering the stunning views of Dallas at night from the tallest Ferris wheel in the Western Hemisphere.

In this sentence, all three are homonyms (same pronunciation but different meanings). Fare is also a homophone of the other two (same pronunciation, different meaning and spelling).

Adding to the confusion potential, English also