[NIFL-TECHNOLOGY:2665] Re: measuring hypertext links chosen

From: Steve Linberg (steve@silicongoblin.com)
Date: Sun Nov 17 2002 - 21:57:22 EST



On Sun, 17 Nov 2002, Tommy B. McDonell wrote:

> Hello. I am doing a dissertation that involves hypertext and am looking for
> something that will measure the times that a student chooses a hypertext
> link. For example, if you are reading a 5 page article online with 67
> hyperlinks, I would like to know how often a student chooses a link, and
> perhaps how long he or she stays on that link. I am not interested in a
> program that I have to reprogram. Thanks for any help.

Hi Tommy.

The quick answer is: you can't get perfect detail on what you're looking
for with the architecture of the web, unless you're operating in a closed
environment where you can observe the readers directly.

Assuming that's not the case, what you need is access to the web server's
logs.  Your ISP should be able to provide these for you.  (I'm also
assuming you're talking about analyzing usage on your own sites, or on a
site whose owner is willing to share all of the information with you.)
You can then run the logs through a logfile analyzer (like AWStats or
WebTrends or others of that ilk) and deduce *some* information; every
request to the server is logged with the IP address of the requestor and
the date and time.  If you want to make the assumption that hits from the
same IP address are the same person, you can then look at the time
intervals and make some guesses about how long that person "stayed" on a
page - although you can't know, of course, whether they were intently
reading that page or whether they got up to walk the dog during that time.
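
To make the time-interval idea concrete, here is a minimal sketch in
Python.  It assumes the server writes the widely used Common Log Format
(your server's format may differ), and the sample lines, paths and IP
address are made up for illustration.  It groups hits by IP address and
reports the gap between each request and the next one from the same
address, which is only a guess at how long the page was on screen.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident user [time] "METHOD path protocol" status bytes
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def parse_line(line):
    """Return (ip, timestamp, path), or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("ip"), when, m.group("path")

def dwell_times(lines):
    """Map each IP to a list of (path, seconds until its next request).

    The last request from each IP has no following request, so it gets
    no estimate at all.
    """
    by_ip = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            ip, when, path = parsed
            by_ip[ip].append((when, path))
    result = {}
    for ip, hits in by_ip.items():
        hits.sort(key=lambda h: h[0])
        result[ip] = [
            (path, (next_when - when).total_seconds())
            for (when, path), (next_when, _) in zip(hits, hits[1:])
        ]
    return result

# Made-up sample data, not taken from any real log.
sample = [
    '10.0.0.1 - - [17/Nov/2002:21:00:00 -0500] "GET /article/page1.html HTTP/1.0" 200 5120',
    '10.0.0.1 - - [17/Nov/2002:21:02:30 -0500] "GET /article/glossary.html HTTP/1.0" 200 2048',
    '10.0.0.1 - - [17/Nov/2002:21:03:10 -0500] "GET /article/page2.html HTTP/1.0" 200 4096',
]
print(dwell_times(sample))
```

Note that the final page of every visit gets no estimate, since nothing
in the log marks when the reader left it.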

You'd ideally want to cross-reference this information with referer
information, which can also be made part of the server logs if you ask
your ISP, so you can tell where a viewer came from when viewing
pages.  You can use this to attempt to recreate a trail.
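
As a rough sketch of the referer idea, the Python below counts how often
each (referring page, requested page) pair appears.  It assumes the
server has been switched to the Combined Log Format, which appends the
referer and user-agent to each line; the example.org URLs and sample
lines are hypothetical.  Each pair approximates one link being followed,
subject to all the caveats about proxies and shared IP addresses.

```python
import re
from collections import Counter

# Assumed layout (Combined Log Format): Common Log Format followed by
#   "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "[^"]*"'
)

def link_counts(lines):
    """Count (referer, requested path) pairs; skip hits with no referer."""
    counts = Counter()
    for line in lines:
        m = COMBINED.match(line)
        if m and m.group("referer") not in ("", "-"):
            counts[(m.group("referer"), m.group("path"))] += 1
    return counts

# Made-up sample data for illustration only.
sample_combined = [
    '10.0.0.1 - - [17/Nov/2002:21:02:30 -0500] "GET /article/glossary.html HTTP/1.0" '
    '200 2048 "http://example.org/article/page1.html" "Mozilla/4.0"',
    '10.0.0.2 - - [17/Nov/2002:21:04:00 -0500] "GET /article/glossary.html HTTP/1.0" '
    '200 2048 "http://example.org/article/page1.html" "Mozilla/4.0"',
    '10.0.0.1 - - [17/Nov/2002:21:03:10 -0500] "GET /article/page2.html HTTP/1.0" '
    '200 4096 "-" "Mozilla/4.0"',
]
for (referer, path), n in link_counts(sample_combined).items():
    print(n, referer, "->", path)
```

Browsers are not required to send a referer, so missing values (logged
as "-") are skipped rather than treated as direct visits.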

One huge potential problem with logfile analysis is making the assumption
that IP addresses are unique.  They often are, but there are at least two
huge cases where they aren't:

1. Proxy servers like AOL, where 500 visitors might all have the same IP
address because they're coming from an internal proxy, and

2. Dial-up IP addresses being re-used.  A user might connect with a
certain IP, and then disconnect, and a short time later that IP might be
reassigned to another user, but you wouldn't be able to easily tell the
difference.
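
One common heuristic for the re-use case, offered here as an assumption
rather than a fix, is to treat a long silence from an IP address as the
end of one visit and the start of another.  The Python sketch below
splits one address's request times that way; the 30-minute cutoff is an
arbitrary choice, and the function name and sample times are
hypothetical.  It does nothing for the proxy case, where many people
share one address at the same time.

```python
from datetime import datetime, timedelta

def split_visits(timestamps, max_gap=timedelta(minutes=30)):
    """Group request times from one IP into visits.

    A new visit starts whenever the gap since the previous request
    exceeds max_gap (an assumed cutoff, not a standard).
    """
    visits = []
    for when in sorted(timestamps):
        if visits and when - visits[-1][-1] <= max_gap:
            visits[-1].append(when)
        else:
            visits.append([when])
    return visits

# Made-up request times for a single IP address.
hits = [
    datetime(2002, 11, 17, 21, 0),
    datetime(2002, 11, 17, 21, 5),
    datetime(2002, 11, 17, 22, 30),
]
for visit in split_visits(hits):
    print(len(visit), "request(s) starting at", visit[0])
```

Two different dial-up users who happen to get the same address within
the cutoff would still be merged, so this only narrows the problem.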

Proxy servers can also interfere with your ability to detect hits - if
your pages are cached by a proxy server, users behind the proxy might just
navigate the cached copies after the first load, and you'd never see that
traffic, which would affect your sample.

I don't know how exact you need your data to be, but these are some of the
things you need to keep in mind when planning.

Cheers,

Steve


-- 
Steve Linberg, Chief Goblin 
Silicon Goblin Technologies 
http://silicongoblin.com 
Be kind.  Remember, everyone you meet is fighting a hard battle. 


