Showing posts with label LPT730. Show all posts

Wednesday, September 24, 2008

LPT730 Lab #3 - Part 2, The Robot Exclusion Standard

The Robot Exclusion Standard is a voluntary standard by which web spiders and other automated downloading programs can avoid downloading content that's otherwise publicly available. The need for such a standard came about because search engines and other legitimate robots were downloading inappropriate content, such as a cgi-bin directory containing program code that has no place in a search result. While this standard is voluntary, it's a good example of an imperfect but workable solution on the Internet.

Given the current nature of how the Internet communicates, it's highly impractical if not impossible to hide content from a subset of visitors to your web site. It would take nothing short of a redesign of basic protocols such as HTTP to make this happen. So a cooperative arrangement has evolved where a website author creates a file on the site called robots.txt telling web crawlers and other robots where they are and are not welcome. Here's an example of a robots.txt file that asks all robots to refrain from downloading any files from the entire web site:
# Tells Scanning Robots Where They Are And Are Not Welcome
# User-agent: can also specify by name; "*" is for everyone
# Disallow: if this matches first part of requested path, forget it
User-agent: * # applies to all robots
Disallow: / # disallow indexing of all pages
Here's an example that asks crawlers to avoid downloading CGI code:
User-agent: *
Disallow: /cgi-bin/
Disallow: /Ads/banner.cgi
Will doing this prevent all unwanted downloading? No. As it's a voluntary standard, some unscrupulous people will download whatever parts of your site they wish to. However, it still makes sense to create the exclusion file because the majority of users will obey it and thus a website owner can save significant money in download bandwidth and headaches by having a website that's under an appropriate load. For more information about the robot exclusion standard, try this FAQ or the references below.
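To see how a cooperative robot might honour the second example above, here is a minimal sketch using Python's standard urllib.robotparser module. The robot name "ExampleBot" and the www.example.com addresses are made up for illustration, and the rules are parsed from a string rather than fetched from a live site.

```python
from urllib.robotparser import RobotFileParser

# The second robots.txt example from this post, as a string.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /Ads/banner.cgi
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(RULES)

# "ExampleBot" is a hypothetical robot name; the "*" rules apply to it.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))          # True
print(parser.can_fetch("ExampleBot", "http://www.example.com/cgi-bin/search.cgi"))  # False
print(parser.can_fetch("ExampleBot", "http://www.example.com/Ads/banner.cgi"))      # False
```

Nothing forces a robot to run a check like this, which is exactly why the standard only works with robots that choose to cooperate.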

References
--
The Web Robots Page - http://www.robotstxt.org/
Wikipedia - http://en.wikipedia.org/wiki/Robots.txt
Web Developer's Virtual Library - http://www.wdvl.com/Location/Search/Robots.html

Tuesday, September 23, 2008

LPT730 Lab #3 - Part 1, Phishing

The term "phishing" describes the act of trying to fraudulently acquire private and sensitive information from someone for criminal purposes by pretending to be a legitimate entity. A common example of phishing is an email that looks as if it came from your bank, informing you that your bank card has been used in some faraway country and that you could be out some money. It's a common tactic for the message to try to prompt a strong emotional reaction (e.g., panic, fear or greed) from a potential victim. The message then points you to a link that, when clicked, displays a web page asking you for your card number and PIN in order to verify your card's activity. But both the email message and the web page are fraudulent. They're designed to look exactly as if they've come from the actual bank. If you enter the information, it won't be long before your account is empty.

Early phishing attempts of this type could be detected by moving the mouse cursor over the link in the email message and looking at the status bar. If the web address displayed wasn't the bank's, you knew you were being lied to. But because today's email messages can have embedded JavaScript (programming code) that alters a browser's status bar, it can be almost impossible to detect a phishing attempt. Phishing doesn't have to occur on your computer. You could just as easily get a voice message from someone claiming to be your bank and leaving a number to call back. Because they use a voice-over-IP (VoIP) phone number and false caller ID information, they could appear to be legitimate.
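The "is this address really the bank's?" check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name host_matches and the domain examplebank.com are hypothetical, and a real anti-phishing filter does far more than compare host names.

```python
from urllib.parse import urlsplit


def host_matches(url: str, trusted_domain: str) -> bool:
    """Return True only if the URL's host is the trusted domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    trusted = trusted_domain.lower()
    return host == trusted or host.endswith("." + trusted)


# "examplebank.com" is a made-up domain standing in for your real bank.
print(host_matches("https://www.examplebank.com/login", "examplebank.com"))          # True
print(host_matches("http://examplebank.com.evil.example/login", "examplebank.com"))  # False
print(host_matches("http://www.examplebank.com@203.0.113.5/login", "examplebank.com"))  # False
```

The last two addresses show common tricks: the bank's name appears somewhere in the link, but the host the browser would actually contact is something else.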

Some Tips to Help You Avoid Phishing Attacks
  • Don't click on links in an email to go to a website. Use your bookmarks or type a trusted address into your browser's location bar.
  • Don't call the phone numbers that come in emails. Use a number from your paper statement or from the company web site.
  • Update your web browser. Microsoft Internet Explorer 7 and Mozilla Firefox 2 or later contain anti-phishing features. These are the oldest versions you should be using.
For a more complete list of tips try here.

References
---
Anti-phishing working group - http://www.antiphishing.org/
The Phishing Guide - http://www.technicalinfo.net/papers/Phishing.html
Wikipedia - http://en.wikipedia.org/wiki/Phishing
RCMP - http://www.rcmp-grc.gc.ca/scams/phishing_e.htm
Reporting Economic Crime On-Line - http://www.recol.ca/

Wednesday, September 10, 2008

LPT730 Lab #1 - Part 2 - Bill C-61

What is Bill C-61?
Bill C-61 is an amendment to Canada's Copyright Act intended to bring Canadian law into accord with treaty obligations agreed to when Canada joined the World Intellectual Property Organization (WIPO). If you go here you can get a good list of specific "dos", "don'ts" and potential fines. In simple terms, the argument for the bill is that content creators deserve to be fairly compensated for their work and that society should work against consumers who use the Internet and other technologies to freely share copyrighted content. The argument against states that this law's penalties are excessively punitive and would give the content creator the right to charge the consumer again and again for services that add no real value, such as format shifting (e.g., music from a CD to an MP3 on a digital player). To be clear, the bill does allow format shifting for any content without digital rights management (DRM), but if any DRM exists it's illegal to attempt to circumvent it. So in many cases format shifting would be illegal (e.g., copying a movie from a DVD and converting it into a smaller MP4 to watch on a notebook computer while traveling somewhere).

What is with Bill C-61?
Bill C-61 was about to become law but Canada's parliament was recently dissolved because an election was called. This happens to many bills that are excessively controversial. For those trying to get or remain elected this type of issue is a no-win situation. Because there are far more consumers of media than there are producers, it can only lose votes for a politician. Whatever side you come down on, I think it's safe to say that our next government will show equal skill in procrastinating on this issue.

References
C-61 - Text of the act
Wikipedia summary of bill C-61
Michael Geist's Fair Copyright For Canada blog

Tuesday, September 9, 2008

LPT730 Lab #1 - Part 1 - Software Patents

Software patents are confusing. A patent is basically a trade between an inventor and the government. The inventor discloses to the government exactly how the invention works and if it is unique, the government grants the inventor a monopoly to manufacture, sell and export the invention for a number of years (usually 20). Patents are designed to promote innovation by allowing the inventor to offset the high cost of inventing something with a guaranteed number of years to recoup costs and return a reasonable profit. While this system has worked very well for many industries, it has proven problematic when applied to software.

The first and most basic problem is that the notion of locking up an innovation for 20 years makes no sense in the software world. Software evolves too quickly to be tied down for 20 years, making many patents useless. An example of this is the patent granted to Unisys for its LZW compression used in GIF graphic image files. The U.S. patent expired in 2003, and although Unisys granted royalty-free licenses to many groups while it was in force, it did no good. Unisys was universally hated for enforcing the patent and the software industry moved on to a patent-free technology (PNG).

A second problem is that patent offices have a very inconsistent record in evaluating what software is patentable. A good example is the patent granted to Amazon.com regarding its 1-click technology. This technology is so technically obvious that granting one company exclusivity is clearly unfair to its competitors and indeed to society at large. But the patent was successfully enforced against an Amazon competitor (Barnes and Noble). This success has given rise to a third problem - patent trolls.

Patent trolls are unscrupulous companies that seek out, acquire and enforce patents solely for their potential value in litigation. A recent example is a company called NTP which sued Research In Motion (RIM) and Palm Inc. (among others) claiming that their patent on mobile email was being violated, despite systems that existed in the public domain before NTP received its patents. The result was a big payday for NTP and, for RIM, wasted money and time and reduced innovation. Just as in the Unisys case, the goal of promoting innovation was hurt by the patent - exactly the opposite of what was intended.

Another problem with patents is that they are not universally applied around the world. Each nation has its own set of laws regarding what can be patented and how long a patent will last. For example, the European Community doesn't allow patents for software while the United States has very broad guidelines regarding what software can be patented.

While there's been much discussion on how to change the patent system to deal more fairly with software, little concrete action has come about in North America. Until changes are made, this jurisdiction will have to endure the extra hassle and inefficiency that comes from regulating obvious innovations.

References
An article from LawMart.com about the differences between copyright, trademark and a patent
Wikipedia Article on Software Patents
Wikipedia Article listing notable Software Patents

Sunday, September 7, 2008

LPT730 Lab #0 - Part 2 - Two Pieces of Software I Regularly Use

MythTV is a homebrew personal video recorder (PVR) that lets your Linux computer greatly enhance your TV viewing experience. While writing this, I'm watching a movie called Adam's Rib starring Spencer Tracy and Katharine Hepburn. I recorded this movie on August 31 knowing I wouldn't have the time to watch it until much later. This kind of time-shifting hasn't been new since the VCR was invented, but for my Myth box this is just the tip of the iceberg. While watching this movie I'm also recording the Belgian Grand Prix and flagging the commercials on another recording made last night. I can also:
  • pause, rewind, play in slow-motion (both recordings and live-TV)
  • skip forward to pass commercials or boring bits
  • skip back to review something I didn't get the first time
  • put multiple TV tuner cards in my computer (currently I have 3) and record many channels at once
  • burn what I record onto a DVD
  • play and archive my DVDs
just to name a few. For a more complete list click here. It's hard to convey how much better TV is when not encumbered by commercials and the strict constraints of a broadcast schedule. I really notice the difference when I visit a friend with regular TV and sit there biding my time, silently reciting some mantra during the commercials or stifling my annoyance at the fact that I just missed some important detail and can't rewind. Such rank inflexibility could only be designed by the advertising community and I see the PVR as a natural and sane response by the viewing public. At this point I would most likely stop watching TV if no PVR were available. Here's a screen capture showing some of my media library.


iTunes on Windows is one of the last proprietary applications I still use. While I've tried gtkpod and YamiPod, I've found them both more hassle than they're worth. You can say what you like about Apple's proprietary business model and expensive hardware, but they design software that satisfies my needs and requires virtually no training or learning curve. Here's a snapshot of my music library in iTunes.


I surmise that one day a team will put together a Linux distribution with this much ease of use as a primary goal. That would be a good thing.

Wednesday, September 3, 2008

LPT730 Lab #0 - Part 1 - A Bit About Myself

I've a degree in geography and skills in developing and using Geographic Information Systems. I'm self-taught in some older technologies such as xBase, VB (up to version 6 / pre-.NET) and Microsoft Access. My interest in computers and open source is both practical and philosophical.

Computers are likely the most flexible tools for managing information ever invented. I think back to the days before the Internet, Google and Wikipedia (to name a few) and I remember how so many arguments were won by the person who yelled the loudest and longest. Life is saner today but by no means perfect. While these new technologies ease the aggregation and communication of knowledge, they also bring about a new set of problems - information overload for example. Today it's easy to get sidetracked by:
  • superfluous details (too much information)
  • lies and half-truths (false information, misinformation)
  • errors (incorrect information)
  • obsolete information
I find myself constantly developing new techniques and learning skills in order to better navigate around these ever-present distractions. But would I go back to the way it was? No way. From poker to recipes to personal finance to finding the closest beer store that's still open, the Internet has greatly contributed to my knowledge and appreciation of the world. I like to use open source software (OSS) when I can because it generally does a better job helping me get my work done and almost always for a much lower cost. But an equally important reason is greater freedom of choice. OSS file formats are better documented, letting me more easily migrate my data to another software tool should I need to. OSS gives me access to the source code, so many small changes can be made with my own resources. I may not be able to fix the code myself but I can hire someone to fix a bug or add a small feature and thus my problem gets solved. I have no such recourse with proprietary software. With OSS, no vendor is forcing me to "upgrade" to a new version with "features" that I don't need just so I get a security fix and the vendor can meet their quarterly sales quota. OSS isn't perfect, but I find it's constantly improving and I've never had cause to regret my choice to use it.