Blogs

Twenty-eight tips for research writing

Part of my job is to train students to write research papers. This involves me "teaching" technical writing, which I am definitely not trained to do. So I don't know what general training to give students to improve their writing. However, there are a lot of do's and don'ts that I like and don't like, respectively. These are (mostly) easy rules to follow that a student can (and should) check on their own. Please add comments to this post if you think of other helpful do's and don'ts.

  1. DO NOT start a sentence with a reference.
  2. DO NOT start a sentence with any number that is not spelled out. That is, don't say "34 sensors are placed in a circle..."; instead say "Thirty-four sensors are placed in a circle".
  3. DO tend to put citations at the end of a sentence rather than at the start. It's nicer form to say "Objects moving near a transmitter or receiver cause fading [6]", rather than "Reference [6] shows that objects moving near a transmitter or receiver cause fading". The latter makes it seem like you just want to check off all of the references that you need to cite, rather than wanting to add the contributions of these other authors as the basis for the paper you're writing.
  4. DO build your research upon others' results. Hal Daumé (Utah SoC) reports being struck by a reviewer who commented, "Other approaches don't have to be bad in order for your approach to be good." There's no need to put down other results; just show how your results or your research builds on the foundation of this past research.
  5. DO NOT excessively name the authors in papers when citing them. Say what they've contributed, then cite the paper.
  6. DO specify the contributions of the paper in at least one or two sentences when citing it. DO NOT cite lots of papers at once, unless the topic is only marginally related to the topic of your writing.
  7. DO write out in words any whole numbers below about fifteen when counting the number of something in a paper. I'm not sure exactly where the line is; but instead of, "There are 4 filters in Figure 2", say "There are four filters in Figure 2".
  8. DO put some concrete numerical results from your research in the abstract, for example, "We demonstrate a 35% improvement compared to the state-of-the-art method".
  9. DO NOT say "complex" when you really mean "complicated". The word "complex" should be used only when describing numbers which have real and imaginary components.
  10. DO NOT say "Rx" when you mean "receiver". The abbreviation "Rx" is for "prescription", which is probably not what you mean. The abbreviation "RX" can be used for receiver.
  11. DO NOT say "being done" when you mean "performed".
  12. DO NOT say "utilize" when you mean "use". In general, use simple language when it exists.
  13. DO NOT say "I". Not in a technical paper. Even if you are writing a paper as a sole author, use "we". Al Hero, my former advisor, told me that it is the royal "we".
  14. DO say "we" to avoid excessive passive sentences, when talking about what we do in this paper. I imagine technical writing classes tell you not to, but in my experience, it often works well, and helps people read your writing.
  15. DO use acronyms when they are common. Or perhaps you are allowed to create one or two in your paper. Just be sure to spell out any acronym before its first use. Also, if you spell out the acronym in the abstract, you will have to do it again in the introduction.
  16. DO NOT excessively use acronyms. Don't use an acronym if you only use that acronym once, because then you have to spell it out anyway (and thus it doesn't save you any space). If you first define an acronym in the Discussion or Conclusion section, it's probably too late to introduce that acronym.
  17. DO start each paragraph with a topic sentence that says what the paragraph is going to say. The following sentences in the paragraph support that topic sentence. Anything that doesn't support that topic sentence doesn't belong in the paragraph. If you find some odd sentence that you want to say, but it doesn't fit in that paragraph, put it in some other paragraph, make a new paragraph, or delete it (if it is unrelated to any topic in your paper, then you don't need it).
  18. DO finish each paragraph with a concluding sentence, which either ties the supporting sentences together, or provides a lead-in to the next paragraph. Or both.
  19. DO NOT start a section with "The function blah is given by" followed by an equation. Each section requires a reason to exist, and that reason must be presented first. No equation's importance is self-evident.
  20. DO NOT use the phrase, "It means ...". You might instead write "Equivalently, ...", or "That is, ...".
  21. DO run a spell check! I know it's a pain to "ignore" all of the math and LaTeX notation that the spell check catches. But you will find errors.
  22. DO NOT present data without discussion! This is a big one. Your job is not just to do the analysis and produce a beautiful figure or table. Your job is to tell the reader what it means. Sure, they could figure it out on their own. But most will not spend the time. Tell them what you learned from looking at the data and/or results.
  23. DO capitalize things that are named after a person, e.g., Fourier and Gaussian. DO NOT capitalize a phrase just because it has an acronym associated with it. For example, if you're introducing radio tomographic imaging (RTI), you can keep the words "radio", "tomographic", and "imaging" all lowercase, even though there is an acronym also being introduced. People will know where you got the letters "R", "T", and "I".
  24. DO NOT use non-technical language. That is, some words that are acceptable in spoken English are not acceptable in technical writing. For example, "totally", "good".
  25. DO learn about the crazy things we are supposed to do for the sake of the English language. I apologize for this language; there is no reason for some of these things, but here we are. Learn when to use "the" or "a" or nothing before a noun ("We present the maximum likelihood estimator..." vs. "We present an estimator..." vs. "We discuss estimation ..."). DO make sure that the subject and verb agree: whether the subject is plural or singular, the conjugation of the verb must match it. When capitalizing a title, learn which words are capitalized and which (short) words are not (for example, the word "of" is not in "University of Utah").
  26. DO look up tech words to see if your use is standard notation. Your spell check is probably useless; but Google isn't. For example, is "multi-path" hyphenated? Google gives "multipath" thirty times more results than "multi-path", so it gives me a good indication to kick the hyphen on this one.
  27. DO NOT use the word "clearly". In my opinion, this word is used by researchers who don't want to explain themselves. I've read it in papers, and quite often I have no idea why the conclusion is so clear.
  28. DO be as specific as possible, all things being equal. If an equivalent-length sentence could have been more specific, then use it instead.
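Several of the rules above are mechanical enough that a student could check them with a short script. The following is a minimal sketch in Python, not a complete style checker: the function name `lint`, the choice of rules, and the regular expressions are all assumptions made for illustration. The patterns are rough heuristics and will produce false positives (for example, "Fig. 2" looks like a sentence starting with a numeral).

```python
import re

# Each entry: (rule number from the list above, pattern, message).
# The patterns are illustrative heuristics, not a full grammar of the rules.
CHECKS = [
    (1, re.compile(r"(?:^|[.!?]\s+)(?:\[\d+\]|\\cite)"), "sentence starts with a reference"),
    (2, re.compile(r"(?:^|[.!?]\s+)\d"), "sentence starts with a numeral"),
    (12, re.compile(r"\butiliz", re.IGNORECASE), 'say "use" rather than "utilize"'),
    (20, re.compile(r"\bIt means\b"), 'avoid "It means ..."'),
    (27, re.compile(r"\bclearly\b", re.IGNORECASE), 'avoid "clearly"'),
]


def lint(text):
    """Return a list of (line number, rule number, message) for each match."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern, message in CHECKS:
            if pattern.search(line):
                findings.append((lineno, rule, message))
    return findings


sample = (
    "34 sensors are placed in a circle.\n"
    "We utilize a filter. Clearly, it works.\n"
    "[6] shows that objects cause fading."
)
for lineno, rule, message in lint(sample):
    print(f"line {lineno}: rule {rule}: {message}")
```

A script like this only catches the surface-level rules; the rules about topic sentences, discussion of data, and specificity still need a human reader.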

Making animations with Python

It is often necessary to make an animation from a series of figures to show your results to others. I found a useful Python toolbox that makes this very easy. The following steps make an animation file with Python.

1. First install the open source toolbox scitools from http://code.google.com/p/scitools/
2. In your python code, import the easyviz package: from scitools.easyviz import *
3. Then, save figures as files with names fig#.eps: savefig('fig001.eps')
4. Finally, make the animation: movie('fig*.eps', encoder='convert', output_file='animation.gif')

If you want to make an animation in another format such as mpeg, just change the encoder to 'ffmpeg' or another keyword. Detailed information about the "movie" function can be found in the documentation of the scitools package.
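One detail in the steps above that is easy to get wrong is the frame numbering. The name 'fig001.eps' is zero-padded, and that matters if the 'fig*.eps' wildcard is expanded in sorted filename order (an assumption here; check the scitools documentation). The sketch below uses only the standard library, and `frame_filename` is a hypothetical helper, not part of scitools.

```python
def frame_filename(index, prefix="fig", ext="eps", width=3):
    """Return a zero-padded frame name such as 'fig001.eps'."""
    return f"{prefix}{index:0{width}d}.{ext}"


# Zero-padded names sort in the same order as the frame numbers.
padded = [frame_filename(i) for i in range(1, 13)]
print(sorted(padded) == padded)

# Unpadded names do not: 'fig10.eps' sorts before 'fig2.eps'.
unpadded = [f"fig{i}.eps" for i in range(1, 13)]
print(sorted(unpadded)[:4])

# In a plotting loop, following the steps in the post, one would call
#   savefig(frame_filename(i))
# for each frame i before calling movie('fig*.eps', ...).
```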

Entropy of English

I just finished dusting off some Matlab code to estimate the entropy of English character sequences from a text source. In my opinion, this is a good tool to teach entropy rate. One might use the idea to calculate the entropy rate of another language, or of another discrete-valued data source, like numerical data or Twitter tweets. My code isn't particularly smart; my storage (and computation) increases as $L^N$, where $L$ is the number of characters considered and $N$ is the character sequence length. I'm sure someone more adept at programming can implement a more efficient version (perhaps a hash table?). However, the code does work, and computes the entropy for a sequence of characters (I've tested up to 4) from a given text file. I used Shakespeare's Romeo and Juliet, and found per-character entropies of 4.12, 3.73, 3.35, and 2.99 for $N=$ 1, 2, 3, and 4, respectively. Info on how this is done is in my lecture 4 notes from today's Advanced Random Processes class; and the letter entropy Matlab code and Shakespeare text are also posted.
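The hash-table idea can be sketched in a few lines of Python. This is not the posted Matlab code. It assumes that "per-character entropy" means the entropy of length-N character blocks divided by N, which may differ from the definition in the lecture notes, and the function name is made up for illustration. Because it counts only the blocks that actually occur, storage grows with the number of distinct blocks seen rather than with $L^N$.

```python
from collections import Counter
from math import log2


def per_char_entropy(text, n):
    """Estimate block entropy of length-n character sequences, in bits per character."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Count every overlapping length-n block; Counter is a hash table.
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    # H = sum over blocks of p * log2(1/p), with p = count / total.
    block_entropy = sum((c / total) * log2(total / c) for c in counts.values())
    return block_entropy / n


toy_text = "ab" * 50  # toy input; replace with the contents of a real text file
for n in (1, 2):
    print(n, per_char_entropy(toy_text, n))
```

To try it on a real source, read the text file into a string and, if desired, restrict it to the characters of interest before calling the function.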

Python Sets

Python "sets" are handy for keeping track of certain kinds of data, particularly when dealing with set theory or when you want to remove duplicate entries. Here's the description from the Python documentation:

"A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference." Examples found here.
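A short illustration of the uses named in the quote may help. The variable names and values below are invented for the example (think of them as node IDs heard by two receivers); only the built-in set operations themselves come from Python.

```python
# Remove duplicates from a list by converting it to a set.
node_ids = [7, 3, 7, 12, 3]
unique_ids = set(node_ids)

# Membership testing.
print(7 in unique_ids)

# Mathematical set operations on two hypothetical receivers' observations.
heard_by_a = {3, 7, 12}
heard_by_b = {7, 12, 15}

both = heard_by_a & heard_by_b          # intersection
either = heard_by_a | heard_by_b        # union
only_a = heard_by_a - heard_by_b        # difference
exactly_one = heard_by_a ^ heard_by_b   # symmetric difference

print(both, either, only_a, exactly_one)
```

Since sets are unordered, the order in which elements print is not something to rely on.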

PhD Completed

My goal of obtaining a PhD degree in electrical engineering has been completed. I recently defended my dissertation, addressed the edits that my committee requested, and completed the edits from the thesis editor. Wow, can't believe I am actually done. I would say it's a relief to be done, and it is in some ways. But, I can't say that I will be happier now that it's over. Working with Neal Patwari and the SPAN lab here at Utah has been a huge blessing and a really great experience.

I applied to some other grad schools around the country before I started working with the SPAN lab: MIT, Stanford, Cal Tech, and USC. I got a quick rejection letter from all of them. At the time, I was pretty bummed to not get into one of these great schools. But in hindsight, I find it hard to believe that any of these options were a better fit for me than the SPAN lab. I got to work on exactly the kind of technology I was seeking, the challenges were perfect for my skill set, I drastically improved as an engineer through great classes and research, I worked with one of the best advisors around, I followed my aspirations of forming my own company, and had a great time all along. I don't know how things could have gone better for me over the past few years, and I feel really lucky and grateful to be a Utah-SPAN alum now.

A special note and thanks goes to Neal Patwari. Neal has been 100% supportive and helpful to me during this pursuit. Let's not forget that Neal was the first person to try radio tomographic imaging in wireless networks, and that became the foundation of my work. He gave me the freedom to do what I enjoyed, supported my efforts in starting Xandem, and kept me on track with his technical abilities. I nominate Neal for advisor of the decade. The SPAN lab is a world-class group, and I'm excited to continue my participation and support.

I would like to offer a few words of advice to any current or future PhD students who care. Hopefully I don't sound like a know-it-all; I just want to record some of my thoughts and things I've learned throughout this process.

1) When things get tough, just keep working and learning. There were so many times when I felt like I had hit a dead end and would never figure out how to move past a challenge. Months went by last year where I felt like I wasn't making any progress at all. Then, in about a two-day span, everything changed and I was able to move very quickly.

2) Research needs time to breathe and you can't force it. You can't just sit down at your desk and say "I'm going to make a research breakthrough today." It needs a lot of time and subconscious processing. You have to work hard, but you also have to know that it requires a lot of patience.

3) When things don't work, figure out why. Instead of saying "this idea just doesn't work," try to figure out exactly what assumption in your model or device is not holding up. If you can do that, I'll bet you make discoveries that are more important than your hypothesized model. If you don't know where you are weak, you can't get stronger. Don't be afraid to face up and acknowledge that you don't understand what is going on.

4) Don't get trapped into thinking that you aren't as good of an engineer as another, and don't act like you're better than everyone else. One problem we have as engineers is that we don't want to reveal that we don't understand something. We think that all the other engineers have everything figured out perfectly and have no intellectual weaknesses. We think that if we tell someone we don't understand, we're saying we're not smart. This leads to everyone as a collective whole feeling like they are inadequate, because most are putting on a front of intellectual superiority. Don't be guilty of acting like you know everything, and don't be guilty of feeling inadequate. Next time you don't understand something, smile and say "I don't get it!"

5) Do real-time experiments if possible. Playing with things and getting instant feedback can give you insights that you will never have if you save data and try to crunch it later.

6) Plow through peer review. Don't complain and whine about the reviews like I did. It's better to just accept them and address them quickly. I found that once I accepted the reviews, I could get the revisions done very quickly.

7) Enjoy it while it lasts. Now that I have been done with my PhD for a few weeks, I can promise you that finishing a PhD is not a magic token of happiness. If you can't learn to enjoy your work now, as a student, then you probably won't enjoy your work later. Figure out how to balance enjoyment now with future benefit (read the book called "Happier" by Tal Ben-Shahar).

Now that I'm done with my degree, I will be dedicating myself full-time to Xandem, our technology company that will develop device-free localization products. It's an exciting endeavor, but I also have to say it's very intense. I've only been done with my research for a few weeks now, and already I'm in many meetings a week ranging from core technology development to investment pitches. It's a good sign, as there are many out there who are interested in what we are doing, but there are plenty of people who are quick to criticize our efforts. I've heard "you may know how to invent technology, but you have no idea how to run a business", "I'm not saying you're not realistic, I'm saying you're not appealing", and "what you need is someone who really knows what they're doing." Ha! Luckily I have some positive comments to fall back on as well: "You're going to make a mint out of this", and "It's not very often that a company of your caliber comes around."

Thank you to everyone who has helped and supported me throughout this degree!

The "timeit" python function for quick computational comparisons

I recently found out about a handy python function called "timeit." It basically runs the function that you specify repeatedly, then tells you how fast your computer was able to run it. It's useful for comparing different functions. For example, say you want to know how long it takes your computer to add two numbers. You could do this:

>>> timeit 1.2+1.2
10000000 loops, best of 3: 27.3 ns per loop

So, this tells you that it ran the operation 10 million times per trial, repeated the trial three times, and that the fastest of the three trials averaged 27.3 nanoseconds per operation. Now, say I want to know how that compares to doing the exponential function e^1.2.

>>> timeit e**1.2
1000000 loops, best of 3: 250 ns per loop

This time it ran only one million loops, because each iteration takes longer (250 nanoseconds). Since 250/27.3 is approximately 9, we conclude that adding 1.2 to itself is about 9 times faster than raising the constant e to the power 1.2, on my particular hardware and Python installation. This can be very useful information when writing scripts that need to run efficiently.

Keep in mind you can run almost anything with timeit. A function that you have written and a built-in Python function are both possibilities.
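The same comparison can be scripted with the `timeit` module in the Python standard library. The sketch below is one way to do it, not the commands used in the post: `best_time_per_loop` is a helper written for this example, and the loop count is an arbitrary choice. The value is stored in a variable `x` because the interpreter may precompute a constant expression such as `1.2+1.2`, which would make the addition look faster than it is.

```python
import timeit


def best_time_per_loop(stmt, setup="pass", number=100_000, repeat=3):
    """Run `stmt` `number` times per trial, `repeat` trials; return best seconds per loop."""
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(totals) / number


add_time = best_time_per_loop("x + x", setup="x = 1.2")
exp_time = best_time_per_loop("e ** x", setup="from math import e; x = 1.2")

print(f"addition:       {add_time * 1e9:.1f} ns per loop")
print(f"exponentiation: {exp_time * 1e9:.1f} ns per loop")
print(f"ratio:          {exp_time / add_time:.1f}")
```

The numbers depend on the machine and the Python version, so the ratio will not necessarily match the factor of 9 reported above.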

Ubuntu 10.04 is coming

Automatic Bibliographies Aren't Automatic

Automatic download of bibliographic information is a great tool for keeping track of what you've downloaded, read and reviewed, and learned from published research. My favorites are Zotero and Google Scholar's Bibliography Manager, which (if you change your Scholar Preferences) shows you the BibTeX for an article. However, at this stage, I have this warning: it is not yet automatic. That is, don't just take the BibTeX entry as supplied. I notice lots of errors and typos, and they make a reference list look unprofessional. People can tell when they read a paper with a bib file that was generated automatically and never reviewed.

For example, Google Scholar seems to choose a random set of words in conference titles to leave lowercase, producing a proceedings title such as "Proceedings of the 2nd international conference on Multi-hop, ad Hoc, and mesh Networks". Fix this so that every word has its first letter capitalized, except for a few common short words (e.g., "a", "on", "and", "of", "the"). Second, these managers don't seem to know that a paper appearing in Citeseer does not make Citeseer its publisher, or that an ACM conference was probably not located in New York, NY, even though the ACM is headquartered in NYC. Delete any publisher information from a conference proceedings paper, except when the publisher is part of the name (e.g., "Proceedings of the IEEE International Conference on Blah Blah Blah"). Next, capitalization in paper titles should be consistent: capitalize the first word in the title, but no other title words (except for proper nouns and acronyms). You may need to put extra curly brackets around each proper noun or acronym to keep the capitalization as you have written it in the title field. And you may need to fix the rest of the capitalization in the title field.

Maybe this is just my pet peeve, but I wouldn't bet on it. Look at and fix all BibTeX entries as you add them, until the day comes when bibliographic managers have better data.
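Two of the fixes described above, capitalizing a proceedings name and wrapping proper nouns in curly brackets, can be partly scripted. The following is a rough sketch, not a replacement for reviewing each entry by hand. Both function names are hypothetical, the short-word list is just the one given above, and the paper title in the usage lines is an invented example.

```python
# Short words to leave lowercase, taken from the examples in the post.
SMALL_WORDS = {"a", "on", "and", "of", "the"}


def capitalize_venue(name):
    """Capitalize the first letter of each word except common short words."""
    fixed = []
    for i, word in enumerate(name.split()):
        bare = word.lower().strip(",.;:")
        if i > 0 and bare in SMALL_WORDS:
            fixed.append(word.lower())
        else:
            # Only touch the first character; leave the rest as written.
            fixed.append(word[:1].upper() + word[1:])
    return " ".join(fixed)


def protect_terms(title, terms):
    """Wrap each listed proper noun or acronym in curly brackets."""
    for term in terms:
        title = title.replace(term, "{" + term + "}")
    return title


venue = "Proceedings of the 2nd international conference on Multi-hop, ad Hoc, and mesh Networks"
print(capitalize_venue(venue))

# Invented title, used only to show the bracket wrapping.
print(protect_terms("Radio tomographic imaging with Gaussian noise", ["Gaussian"]))
```

The sketch leaves hyphenated words such as "Multi-hop" untouched, and calling `protect_terms` twice on the same title would double the brackets, so the output still needs a quick look.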

Online Security Tips

Recently a close friend of mine had their email account compromised. The extent of the damage is unknown at this point, but it has been very stressful. With so much of our personal data being accessible online, it is very important for all of us to step back and evaluate how we are protecting ourselves. In the spirit of trying to prevent this from happening again, I offer the following tips.

Make your password strong. Do not use your name, and do not use words that are in the dictionary. Use capital letters and numbers mixed in. Don't be lazy with this one.

Don't use the same password on multiple sites. If you do, the owner of a malicious site can use your same password to get into your Gmail account. For example, if you sign up for a web forum, use a unique password so that the owners of that site can't get into your account. Make sure you use different passwords for every site you use.

Use https whenever possible. If you use gmail, log into your account and click on "Settings" at the top. Go down to the line that says "Browser Connection" and make sure that you select "Always use https." Click save at the bottom of the page.

Do not use Internet Explorer. Especially version 6.

Use good antivirus software, and keep it up to date. If you're on Windows, make sure that you have the best antivirus available, and make sure you keep it up-to-date. Don't download and install anything that doesn't come from a reputable source.

Avoid using public computers, or even a friend's. Do not use computers that you believe COULD be infected with a virus. Every time you type your account name and password on a keyboard, you run the risk of exposing that data. Avoid checking your account from friends' computers where you have no idea how secure they are. It's like riding in the car with someone else driving. You put yourself at risk if the driver doesn't know what he's doing.

Be pessimistic. Your default should be not to trust something.

Don't think that the system will protect you. You have to take responsibility for your own security at all times. The crosswalk is there to help you cross the street, but that doesn't mean you can put your earphones on and close your eyes to cross, thinking that the rules of the road will make you immune to danger. Your own awareness is so important.

Be very careful logging into stuff from unknown wifi access points. If you don't see HTTPS in the URL, the information can be intercepted.
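For the first two tips (strong passwords, and a different one for every site), one option is to let the computer pick the characters. The sketch below uses Python's standard `secrets` module; the function name, the default length of 16, and the minimum of 8 are assumptions for this example, not recommendations from the tips above. You would still need a safe place to store the results.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def make_password(length=16):
    """Return a random password with at least one lowercase, uppercase, and digit."""
    if length < 8:
        raise ValueError("length should be at least 8")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until capital letters and numbers are both mixed in.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate


# One independent password per site (site names are placeholders).
for site in ("email", "web forum"):
    print(site, make_password())
```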

Be Careful With Social Networking Sites

Dear friends, family, and anyone else who comes across this,

Social networks like Buzz, Twitter, and Facebook are great tools to keep in touch with friends, but there are some dangers as well. If you are going to use social networking sites, you absolutely need to understand what the privacy implications of using the service are. Each social networking tool has its own way of dealing with privacy and determining who can access the information you post. Educate yourself.

For example, let's look at Twitter. When you post a tweet, it's public information. People who don't know you, complete strangers, can see your post. This is very important as you decide whether or not to post information about yourself. Now, you can go into your settings and change that, but by default, Twitter is public. To double-check your Twitter settings, go to http://twitter.com/settings/account.

Facebook can be dangerous as well. You generally have to be "friends" with someone for them to see your profile and your updates. This means you can update things without the entire world seeing them, but you should still be very careful. Often friends-of-friends can see information, and you don't realize your updates are getting out to people you don't know. Furthermore, anyone on Facebook can post a photo of you, then mark your face with your name. This one really bothers me, because I don't have control of photos that get posted with my name being searchable. I'm often very tempted just to delete my Facebook account entirely. To check your Facebook settings, go to http://www.facebook.com/settings/?tab=privacy

Finally, there is Google's new social feature called Buzz. This is kind of a combination of Facebook and Twitter. You need to understand that if you reply to a public Buzz, your reply is public as well. If you post a public Buzz, the whole world can see it. Also, the people you follow and those following you may be publicly viewable as well. Buzz may also post information from your other Google services, like PicasaWeb. There are ways of doing "private" buzzes to only certain groups in your contacts, so you need to decide how you want to use it. To double check your Buzz settings, go here: http://www.google.com/profiles/me/editprofile

Take the time to educate yourself before using these services. I'm not saying that social networking is inherently a bad thing. I'm saying that it's bad if you're using social sites without understanding who has access to the information you post. As long as you understand the implications, you'll be much more likely to maintain your own privacy.
