In February 2003, Google acquired Pyra Labs, owner of Blogger, a pioneering and leading web log hosting website. Some analysts considered the acquisition inconsistent with Google's business model. However, the acquisition secured the company's competitive ability to use information gleaned from blog postings to improve the speed and relevance of articles contained in a companion product to the search engine, Google News.
At its peak in early 2004, Google handled upwards of 84.7% of all search requests on the World Wide Web through its website and through its partnerships with other Internet clients like Yahoo!, AOL, and CNN. In February 2004, Yahoo! dropped its partnership with Google, providing an independent search engine of its own. This cost Google some market share, yet Yahoo!'s move highlighted Google's own distinctiveness, and today the verb "to google" has entered a number of languages (first as a slang verb and now as a standard word), meaning, "to perform a web search" (a possible indication of "Google" becoming a genericized trademark).
Wednesday, November 11, 2009
Financing and initial public offering
The first funding for Google as a company was secured in August 1998 in the form of a US$100,000 contribution from Andy Bechtolsheim, co-founder of Sun Microsystems, given to a corporation that did not yet exist.
On June 7, 1999, a round of equity funding totalling $25 million was announced[25]; the major investors were rival venture capital firms Kleiner Perkins Caufield & Byers and Sequoia Capital.
In October 2003, while discussing a possible initial public offering of shares (IPO), Microsoft approached the company about a possible partnership or merger.[citation needed] However, no such deal ever materialized. In January 2004, Google announced the hiring of Morgan Stanley and Goldman Sachs Group to arrange an IPO. The IPO was projected to raise as much as $4 billion.
On April 29, 2004, Google made an S-1 form SEC filing for an IPO to raise as much as $2,718,281,828. The figure alludes to Google's corporate culture with a touch of mathematical humor, as e ≈ 2.718281828. April 29 was also the 120th day of 2004, and according to section 12(g) of the Securities Exchange Act of 1934, "a company must file financial and other information with the SEC 120 days after the close of the year in which the company reaches $10 million in assets and/or 500 shareholders, including people with stock options." Google stated in its annual filing for 2004 that its 3,021 employees, "except temporary employees and contractors, are also equity holders, with significant collective employee ownership", so Google would have needed to make its financial information public by filing it with the SEC regardless of whether or not it intended to make a public offering. As Google stated in the filing, its "growth has reduced some of the advantages of private ownership. By law, certain private companies must report as if they were public companies. The deadline imposed by this requirement accelerated our decision." The SEC filing revealed that Google had turned a profit every year since 2001 and had earned a profit of $105.6 million on revenues of $961.8 million during 2003.
In May 2004, Google officially dropped Goldman Sachs from the IPO, leaving Morgan Stanley and Credit Suisse First Boston as the joint underwriters. They chose the unconventional route of allocating the initial offering through an auction (specifically, a "Dutch auction"), so that "anyone" would be able to participate in the offering. The smallest account balances required by most authorized online brokers that are allowed to participate in an IPO, however, are around $100,000. In the run-up to the IPO the company was forced to slash the price and size of the offering, but the process did not run into any technical difficulties or result in any significant legal challenges. The initial offering of shares was sold for $85 apiece. The shares closed at $100.34 on the first day of trading, which saw 22,351,900 shares change hands.
Google's initial public offering took place on August 19, 2004. A total of 19,605,052 shares were offered at a price of $85 per share. Of that, 14,142,135 (another mathematical reference as √2 ≈ 1.4142135) were floated by Google and 5,462,917 by selling stockholders. The sale raised US$1.67 billion, and gave Google a market capitalization of more than $23 billion. The vast majority of Google's 271 million shares remained under Google's control. Many of Google's employees became instant paper millionaires. Yahoo!, a competitor of Google, also benefited from the IPO because it owns 2.7 million shares of Google.
History of Google
Google began in January 1996 as a research project by Larry Page, a Ph.D. student at Stanford[1] working on the Stanford Digital Library Project (SDLP). The SDLP's goal was "to develop the enabling technologies for a single, integrated and universal digital library", and it was funded through the National Science Foundation among other federal agencies. In his search for a dissertation theme, Page considered, among other things, exploring the mathematical properties of the World Wide Web, understanding its link structure as a huge graph. His supervisor Terry Winograd encouraged him to pick this idea (which Page later recalled as "the best advice I ever got"), and Page focused on the problem of finding out which web pages link to a given page, considering the number and nature of such backlinks to be valuable information about that page (with the role of citations in academic publishing in mind). In his research project, nicknamed "BackRub", he was soon joined by Sergey Brin, a fellow Stanford Ph.D. student supported by a National Science Foundation Graduate Fellowship. Brin was already a close friend, whom Page had first met in the summer of 1995 in a group of potential new students whom Brin had volunteered to show around the campus. Page's web crawler began exploring the web in March 1996, setting out from Page's own Stanford home page as its only starting point. To convert the backlink data that it gathered into a measure of importance for a given web page, Brin and Page developed the PageRank algorithm. Analyzing BackRub's output, which for a given URL consisted of a list of backlinks ranked by importance, it occurred to them that a search engine based on PageRank would produce better results than existing techniques (existing search engines at the time essentially ranked results according to how many times the search term appeared on a page). A small search engine called RankDex was already exploring a similar strategy.
Convinced that the pages with the most links to them from other highly relevant Web pages must be the most relevant pages associated with the search, Page and Brin tested their thesis as part of their studies and laid the foundation for their search engine. By early 1997, the BackRub page described the state of the project as follows:
The meaning of Sharjeel
Sharjeel has multiple meanings and origins. In the Indian origin, the meaning of Sharjeel is unknown. In the Syrian origin, Sharjeel was the name of a commander of the Roman army. In the Islamic origin, Sharjeel means "fine".
applications software
Includes programs that do real work for users. For example, word processors, spreadsheets, and database management systems fall under the category of applications software.
systems software
Includes the operating system and all the utilities that enable the computer to function.
software
Computer instructions or data. Anything that can be stored electronically is software. The storage devices and display devices are hardware.
The terms software and hardware are used as both nouns and adjectives. For example, you can say: "The problem lies in the software," meaning that there is a problem with the program or data, not with the computer itself. You can also say: "It's a software problem."
The distinction between software and hardware is sometimes confusing because they are so integrally linked. Clearly, when you purchase a program, you are buying software. But to buy the software, you need to buy the disk (hardware) on which the software is recorded.
Software is often divided into two categories:
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.[1][2]
Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example, B-trees are particularly well-suited for implementation of databases, while compiler implementations usually use hash tables to look up identifiers.
Data structures are used in almost every program or software system. Specific data structures are essential ingredients of many efficient algorithms, and make possible the management of huge amounts of data, such as large databases and internet indexing services. Some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design.
Saturday, October 31, 2009
Industrial-Strength Software
Users and developers are separate entities
Business is dependent on it
Bugs CANNOT be tolerated
User-friendly user interface
Documentation (for usage, maintenance, further upgrading, etc.)
Reliability, robustness, portability, etc. are very important
Heavy investment
High Functionality
…
Productivity of 10-100 LOC / month per developer
Students’ Projects
Novice developers/users
Classroom or Small project
User interface is not important, as the software is used by the developers themselves
Documentation and testing are not given much care – the software often contains bugs
Reliability, Robustness, Portability are not important
No investment
Productivity of 2.5 to 5 KLOC/month per developer
Thursday, October 29, 2009
Definition of E-Commerce:
Electronic commerce means the buying and selling of goods and services across the Internet. An e-commerce site can be as simple as a catalog page with a phone number, or it can range all the way to a real-time credit card processing site where customers can purchase downloadable goods and receive them on the spot.
In the face of new market forces created by globalization and mounting competition, we can no longer follow historical paths and seek the status quo. Companies are discovering that old solutions do not work for new business problems. Business has changed, and so have the risks and payoffs.
Electronic commerce is more than just buying and selling products online. Instead, it encompasses the entire online process of developing, marketing, selling, delivering, servicing, and paying for products and services purchased by internetworked, global marketplaces of customers, with the support of a worldwide network of business partners.
Electronic commerce systems rely on the resources of the Internet, intranets, extranets, and other computer networks. Electronic commerce can include:
• Interactive marketing, ordering, payment, and customer support processes at e-commerce sites on the World Wide Web
• Extranet access of inventory databases by customers and suppliers
• Intranet access of customer relationship management systems by sales and customer service reps
• Customer collaboration in product development via Internet newsgroups and e-mail exchanges.
Technologies necessary for E-Commerce:
Technologies that are necessary for electronic commerce include:
• Information technologies
• Telecommunications technologies
• Internet technologies
INTRODUCTION TO E-COMMERCE
Electronic commerce encompasses the entire online process of developing, marketing, selling, delivering, servicing, and paying for products and services. The Internet and related technologies, e-commerce websites on the World Wide Web, and corporate intranets and extranets serve as the business and technology platform for e-commerce marketplaces for consumers and businesses in the basic categories of business-to-consumer (B2C), business-to-business (B2B), and consumer-to-consumer (C2C) e-commerce. The essential processes that should be implemented in all e-commerce applications – access control and security, personalizing and profiling, search management, content management, catalogue management, payment systems, workflow management, event notification, and collaboration and trading – are summarized in Figure 7.5.
E-Business is the creation of new, and the redesigning of existing value chains and business processes through the application of information technology. Naturally, e-Business is more than e-commerce. It expands the scope of e-commerce to transform the company and the industry itself.
Electronic commerce, or e-commerce, is the trade of products and services by means of the Internet or other computer networks. E-commerce follows the same basic principles as traditional commerce: buyers and sellers come together to exchange commodities for money. But rather than conducting business in the traditional way, in retail stores or through mail-order catalogs and telephone operators, in e-commerce buyers and sellers transact business over networked computers.
E-commerce offers buyers maximum convenience. They can visit the web sites of multiple vendors around the clock to compare prices and make purchases, without having to leave their homes or offices, from anywhere around the globe. In some cases, consumers can immediately obtain a product or service, such as an electronic book, a music file, or computer software, by downloading it over the Internet.
For sellers, e-commerce offers a way to cut costs and expand their markets. They do not need to build, staff, or maintain a physical store or print and distribute mail-order catalogs. Automated order-tracking and billing systems cut additional labor costs, and if the product or service can be downloaded, e-commerce firms have no distribution costs at all. Because products can be sold over the global Internet, sellers have the potential to market their products or services globally and are not limited by the physical location of a store. Internet technologies also permit sellers to track the interests and preferences of their customers, with the customers' permission, and then use this information to build an ongoing relationship with the customer by customizing products and services to meet the customer's needs.
E-commerce is about setting up your business on the Internet, allowing visitors to access your website and go through a virtual catalog of your products and services online. When visitors want to buy something they like, they merely "add" it to their virtual shopping basket. Items in the virtual shopping basket can be added or deleted, and when you're all set to check out, you head to the virtual checkout counter, which has your complete total and will ask you for your name, address, etc. and method of payment (usually via credit card). Once you have entered all this information (which, by the way, is being transmitted securely), you can then just wait for delivery. It's that simple. According to a CNN opinion poll, 62% of respondents who were surveyed said they plan to shop online during the Christmas season. Newsweek devoted its front-page story to "shopping.com" in its December 7, 1998 issue (Asian Edition). The title was "Why Online Stores are the Best Thing since Santa Claus".
E-commerce is not just about online stores; it is about anything and everything to do with money. If you pay for things (via cash, check, credit card, etc.), e-commerce is about to make an introduction into your life soon. Banks like Bank of America and Wells Fargo are now giving their clients access to their bank accounts via the web. Soon enough, banks in Pakistan will be following suit. The day is not far away (yes, in Pakistan!) when you will be able to order and reserve your request for a movie at the local video store, all online, and be able to browse through various titles, etc. and, if you are feeling hungry
Wednesday, October 28, 2009
Implementation of Stacks in Memory
We present a linked list method using dynamic memory allocation and pointers. The two structures needed to implement stacks are head structure and data node structure. We name them STACK and NODE, respectively. In C, they are defined as follows.
struct node
{
int data;
struct node *next;
};
struct stack
{
int count;
struct node *top;
}stack1;
These declarations reserve the required memory for the two structures, but no values are assigned to the members of either structure. The following algorithm will initialize the stack, that is, assign values that put it into an empty state, from which it can later expand or shrink as the need arises. The situation before the execution of the algorithm is illustrated in the accompanying figure.
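As a rough illustration of the initialization just described, the following C sketch sets the head structure to the empty state; the function name createStack and its calling convention are assumptions made here for illustration, not taken from the text.

void createStack(struct stack *s)
{
    /* assumes the struct stack and struct node declarations shown above */
    s->count = 0;     /* no elements in the stack yet   */
    s->top   = NULL;  /* an empty stack has no top node */
}

A call such as createStack(&stack1); would then leave stack1 ready for push operations.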
Stack data Node
The rest of the data structure is a typical linked list data node. Although the application determines the data that are stored in the stack, the stack data node looks like any linked list node. In addition to the data, it contains a next pointer to other data nodes, making it a self-referential data structure.
Stack Head
Generally the head for a stack requires only two attributes: a top pointer and a count of the number of elements in the stack. These two elements are placed in a head structure. Other stack attributes can be placed here as well. For example, it is possible to record the time the stack was created and the total number of items that have ever been placed in the stack. These two metadata items would allow the user to determine the average number of items processed through the stack in a given period. A basic head structure is shown in the accompanying figure.
Data Structure
To implement the linked list stack, we need two different structures, a head and a data node. The head structure contains metadata and a pointer to the top of the stack. The data node contains data and a next pointer to the next node in the stack. The stack's conceptual and physical implementations are shown in the accompanying figure.
Stack Top
The third stack operation is stack top. Stack top copies the item at the top of the stack; that is, it returns the data in the top element to the user but does not delete it. You might think of this operation as reading the stack top. Stack top can also result in underflow if the stack is empty. The stack top operation is illustrated in the accompanying figure.
Pop
When we pop a stack, we remove the item at the top of the stack and return it to the user. Because we have removed the top item, the next item in the stack becomes the top. When the last item in the stack is deleted, the stack must be set to its empty state. If pop is called when the stack is empty, then it is in an underflow state. The stack pop operation is illustrated in the accompanying figure.
Push
Push adds an item at the top of the stack. After the push, the new item becomes the top. The only potential problem with this simple operation is that we must ensure that there is room for the new item. If there is not enough room, then the stack is in an overflow state and the item cannot be added. Figure 2 shows the push stack operation.
BASIC STACK OPERATIONS
The three basic stack operations are push, pop, and stack top. Push is used to insert data into the stack. Pop removes data from a stack and returns the data to the calling module. Stack top returns the data at the top of the stack without deleting the data from the stack.
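These three operations can be sketched in C for the linked-list stack declared in the earlier post; the function names, parameters, and return conventions below are illustrative assumptions rather than the text's own code, and #include <stdlib.h> plus the struct declarations shown earlier are assumed.

/* push: add value at the top; fails silently on overflow (no memory) */
void push(struct stack *s, int value)
{
    struct node *n = (struct node *) malloc(sizeof(struct node));
    if (n == NULL)
        return;              /* overflow: no room for a new node */
    n->data = value;
    n->next = s->top;        /* new node points to the old top   */
    s->top  = n;             /* new node becomes the top         */
    s->count++;
}

/* pop: remove the top item and hand it back; returns 0 on underflow */
int pop(struct stack *s, int *value)
{
    struct node *n = s->top;
    if (n == NULL)
        return 0;            /* underflow: stack is empty        */
    *value = n->data;
    s->top = n->next;        /* next node becomes the top        */
    free(n);
    s->count--;
    return 1;
}

/* stackTop: copy the top item without deleting it; 0 on underflow */
int stackTop(struct stack *s, int *value)
{
    if (s->top == NULL)
        return 0;
    *value = s->top->data;
    return 1;
}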
Stacks
A stack is an ordered, dynamic linear list in which all additions of new elements and deletions of existing elements are restricted to one end, called the top. If a series of data items is inserted into a stack and then removed, the order of the data is reversed. This reversing behaviour, caused by adding and removing at the top of the stack, gives the stack its special "last in, first out" (LIFO) character.
A stack is dynamic because it is constantly changing; it expands and shrinks with the passage of time. The basic stack operations are create stack, push, pop, stack top, empty stack, full stack, stack count, and destroy stack. We shall also study the following applications of stacks: reversing data (converting decimal to binary), parsing data (matching of parentheses in source programs), and postponing data usage (infix, postfix and prefix notation, and evaluation of postfix expressions). The stack frames concept is studied in a separate chapter on recursion.
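As a small illustration of the reversing-data application mentioned above, the following C sketch converts a decimal number to binary by pushing remainders and then popping them, so the digits come out in the correct order. It uses a fixed-size array as the stack purely for brevity; the names are assumptions for illustration.

#include <stdio.h>

void printBinary(unsigned int n)
{
    int stack[32];
    int top = -1;                    /* empty stack                   */

    if (n == 0) { printf("0\n"); return; }

    while (n > 0)
    {
        stack[++top] = n % 2;        /* push the remainder            */
        n /= 2;
    }
    while (top >= 0)
        printf("%d", stack[top--]);  /* pop: order is reversed (LIFO) */
    printf("\n");
}

For example, printBinary(19) prints 10011.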
Yardstick for Complexity Measurement
We need to devise a way of characterizing the essential performance properties of an algorithm. Hard performance measures, such as wall-clock time, vary significantly when we use different computers, compilers, and programming languages to express and run the same algorithm. For this reasoning, we use the INSERTION-SORT algorithm as a case study.
Complexity of Algorithms
The soul of analysis of algorithms is to determine their complexity. Knowledge of complexity helps in deciding the implementation and choice of algorithms to solve computation problems efficiently.
The complexity of an algorithm A is the function f(n) or T(n) which gives the running time and/or storage space requirement of the algorithm in terms of the size n of the input data. Frequently the storage space required by an algorithm is simply a multiple of the data size n. Accordingly, the term complexity shall refer to the running time of the algorithm only, i.e., to a function like T(n).
Analysis of Insert Sort
As a case study, let us calculate the running time of the INSERTION-SORT algorithm. We let tj be the number of times the while-loop test at line 1.3 is executed for that value of j. We also assume that comments are not executable statements, and so they take no time.
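For reference, a plain C version of insertion sort is sketched below; the pseudocode line numbering used in the analysis is not reproduced, but the commented while test is the one whose execution count is denoted tj for each value of j. The array and index conventions are assumptions for illustration.

void insertionSort(int A[], int n)
{
    int j, i, key;
    for (j = 1; j < n; j++)           /* corresponds to j = 2 .. n         */
    {
        key = A[j];                   /* next value to insert              */
        i = j - 1;
        while (i >= 0 && A[i] > key)  /* the while-loop test, run tj times */
        {
            A[i + 1] = A[i];          /* shift larger elements right       */
            i--;
        }
        A[i + 1] = key;               /* insert key into its place         */
    }
}

Each line costs some constant ci per execution, so the total running time is the sum, over all lines, of ci multiplied by the number of times line i executes.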
Running Time
The running time of an algorithm on a particular input is the number of primitive operations, or steps, executed. To define the notion of a step, we keep the RAM model viewpoint. In actual practice, a constant amount of time is required to execute each line of pseudocode. One line may take a different amount of time than another line. Let us assume each execution of the ith line takes time ci, where ci is a constant.
Input Size
The problem input size is usually denoted by n, which is usually the number of items handled by the algorithm. For example, for sorting and searching algorithms, the array size n will be input size. Sometimes it is more appropriate to describe the size of the input with two numbers rather than one. For instance, if the input to an algorithm is a graph, the input size can be described by the number of vertices and edges in the graph. In analyzing algorithms, input size of the problem will be indicated.
Input Size and Running Time
In general, the time and/or memory taken by an algorithm grows with the size of the input, so it is traditional to describe the running time of a program as a function of the size of its input. To do so, we need to define the terms "running time" and "size of input". The immediate goal of analysis is to find a means of expression, a measuring yardstick, that helps characterize the time or space requirements of a running algorithm and suppresses tedious details. This leads to the well-known concept of measuring the complexity of an algorithm. The complexity of an algorithm is the function T(n) which gives the running time and/or storage space requirement of the algorithm in terms of the size n of the input data. Frequently, the storage space required by an algorithm is simply a multiple of the data size n. Accordingly, the term complexity shall refer to the running time of the algorithm only.
Random Access Machine (RAM) Model of Computation
Knowing that our algorithms will be implemented as computer programs, we shall assume using a generic one-processor RAM. In the RAM model, instructions are executed one after another, with no concurrent operations. Models based on parallel processors will not be used.
Types of Data Structures
The general morphology of data structures is shown in Figure 3.
Morphology of Data Structures
There are two aspects of managing data structures, namely, logical and physical.
A. Logical Data Structures
- Linear Structures
The most common organization for data is a linear structure. A structure is linear if it has these two properties:
Property 1: Each element of the structure is followed by at most one other element
Property 2: No two elements are followed by the same element
An array is an example of a linearly structured data type. We generally write a linearly structured data type like this: A B C D.
Counterexample 1: If property 1 is violated, an element is followed by more than one element; for instance, A points to both B and C. This example is a tree, which is a non-linear structure. Trees are acyclic structures.
Counterexample 2: If property 2 is violated, two elements are followed by the same element; for instance, A and B both point to C. This is a graph, which is again a non-linear structure.
Other common linear structures, such as linked lists, stacks and queues, are shown above in the picture depicting the morphology of data structures.
Non-Linear Structures
In a non-linear structure there is no limit on the number of predecessors or successors of an element of the structure. An element may have any number of successors or predecessors. These structures violate the properties of a linear structure. Trees and graphs are the important examples.
A tree is a collection of nodes in which every node has a unique predecessor but can have many successors. A graph is also a collection of nodes; in a graph, any node can have any number of successors and predecessors, as shown in the figure.
Data Structure
A structure, usually in computer memory, used to organize data and information for better algorithm efficiency is called a data structure. Examples are arrays, linked lists, queues, stacks and trees.
In the design of computer programs, the choice of data structure is a primary design consideration. The quality of final results depends heavily on choosing the best data structure. The choice of appropriate data structure is crucial and has given rise to many design methods and programming languages. Object oriented languages such as C++ and Java are one group of languages that exhibit this philosophy, and have several in-built structures.
A data structure can also be defined as an aggregation of atomic and structured data types into a set with defined relationships. Structure means a set of rules that hold the data together. In other words, if we take a combination of data types and fit them into a structure such that we can define its relating rules, we have made a data structure. Data structures can be nested. We can have a data structure that consists of other data structures. For example, we can define the two structures array and record as shown in table below.
Array: a homogeneous sequence of data or data types, known as elements, with positional association among the elements.
Record: a heterogeneous combination of data into a single structure with an identified key, and no positional association among its parts.
Usual operations performed on data structures are
1. insertion, 2. deletion, 3. searching, 4. sorting, 5. retrieval, 6. traversing, 7. merging, and many other operations as required by applications.
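To make the distinction between the two structures above concrete, here is a short C sketch; the field names are illustrative assumptions, not taken from the text.

/* Array: a homogeneous sequence of elements, related to each other by position. */
int grades[5] = { 78, 85, 92, 61, 70 };   /* grades[0] .. grades[4] */

/* Record: a heterogeneous combination of data held in a single structure,
   identified by a key field (here, rollNumber). */
struct student
{
    int   rollNumber;   /* identifying key */
    char  name[40];     /* character data  */
    float gpa;          /* numeric data    */
};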
Data Types
There are two types of data, namely, atomic and structured (also called composite) data.
Atomic data are data that we choose to consider as a single, non-decomposable entity. Boolean or logical data are examples. The integer 4562 may be considered a single integer value. We can decompose it into single digits (4, 5, 6, 2), but the decomposed digits will not have the same characteristics as the original integer.
Data and Information
Data is collected about an entity, which can be an object or event (real or abstract) of interest to the user. An entity may be a person, place, location or thing, e.g., a salesperson, a city or a product. An entity can also be an event or unit of time such as a machine breakdown, a sale, or a month.
Data remains raw facts in isolation unless they are manipulated to be useful. These isolated facts do convey meaning but generally are not useful by themselves.
For example, consider a database which stores data about a BEIT class. Data records contain students' names, dates of birth, addresses, courses, and grades in each course. If the principal calls to find out the total number of students enrolled in the BEIT class, the clerk can answer his question by looking at the database. To the clerk, the database is information. But suppose the director of the institute wants to know the total number of students completing the data structures class with an A grade. The director will have to identify the students in the class database and then identify the students with A grades. The information given to the principal is data to the director; it produced useful information for the director only when it was processed further to output the list of those students who had completed the data structures course with A grades. This example tells us that one person's information may be another person's data. In this sense, information eliminates uncertainty about a state, while data are accumulated but unorganized facts.
One can say that data is to information as ore is to gold. A computer-based information system creates, collects and stores data, and processes that data into useful information. Symbolically,
Information = f (data, processing), a function of data and processing.
Data is, therefore, an essential ingredient for generation of information.
Paths, Cycles, and Adjacency:
Two vertices vi and vj in a graph G = (V, E) are adjacent, or neighbours, if there exists an edge e ∈ E such that e = (vi, vj). A path p in a graph G = (V, E) is a sequence of vertices of V of the form p = v1, v2, …, vn (n ≥ 2), in which each vertex vi is adjacent to the next one, vi+1 (for 1 ≤ i < n). The path is said to be simple if all of its vertices are distinct, with the exception that v1 may equal vn. A path is of length n if it consists of a sequence of n+1 vertices; in other words, the length of a path is the number of edges in it.
A cycle is a path p = v1, v2, …, vn such that v1 = vn (so that p starts and ends at the same vertex and forms a loop). Figure 2 illustrates paths and cycles in a graph. A cycle is a path of length greater than one that begins and ends at the same vertex. In an undirected graph, a simple cycle is a path that travels through three or more distinct vertices and connects them into a loop. Formally speaking, this means that if p is a path of the form p = v1, v2, …, vn, then p is a simple cycle if and only if n ≥ 4, v1 = vn, and vi ≠ vj for distinct i and j in the range 1 ≤ i, j ≤ n−1. Put differently, when you travel around the loop in a simple cycle, you must visit at least three different vertices, and you cannot travel through any vertex more than once.
Graphs and Multigraphs:
A graph G = (V, E) consists of two things:
(1) A set V of elements called vertices (or points or nodes)
(2) A set E of edges such that each edge ei ∈ E is identified with a unique (unordered) pair [vi, vj] of vertices in V, denoted by ei = [vi, vj]
The graph shown above consists of 5 vertices and the edges joining them.
Suppose e = [vi, vj]. Then the vertices vi and vj are called the endpoints of e, and vi and vj are said to be adjacent vertices, or neighbours. The degree of a vertex v, written deg(v), is the number of edges containing v. If deg(v) = 0, that is, if v does not belong to any edge, then v is called an isolated vertex.
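One common way to store such a graph in memory is an adjacency list, with one linked list of neighbours per vertex. The C sketch below is an illustrative assumption, not code from the text; error checking is omitted for brevity.

#include <stdlib.h>

struct edgeNode
{
    int vertex;                 /* the adjacent (neighbouring) vertex */
    struct edgeNode *next;      /* next neighbour in the list         */
};

struct graph
{
    int numVertices;            /* vertices are numbered 0 .. n-1     */
    struct edgeNode **adj;      /* one list head pointer per vertex   */
};

/* Add the undirected edge [u, v]; deg(u) and deg(v) each increase by 1. */
void addEdge(struct graph *g, int u, int v)
{
    struct edgeNode *a = (struct edgeNode *) malloc(sizeof(struct edgeNode));
    a->vertex = v;  a->next = g->adj[u];  g->adj[u] = a;

    struct edgeNode *b = (struct edgeNode *) malloc(sizeof(struct edgeNode));
    b->vertex = u;  b->next = g->adj[v];  g->adj[v] = b;
}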
Separate Chaining Resolution Using Linked List
A major disadvantage of open addressing is that each collision resolution increases the probability of future collisions. This disadvantage is eliminated in the second approach to collision resolution, known as separate chaining, which uses short linked lists. Separate chaining is an ordered collection of data in which each element contains the location of the next synonymous element. For example, in Figure 12, array element 001, Sarah Trapp, contains a pointer to the next element, Harry Eagle, which in turn contains a pointer to the third element, Chris Walljasper. Linked list resolution uses a separate area to store collisions and chains all synonyms together in a linked list. It uses two storage areas, the prime area and the overflow area. Each element in the prime area contains an additional field, a link head pointer to a linked list of overflow data in the overflow area. When a collision occurs, one element is stored in the prime area and chained to its corresponding linked list in the overflow area. Although the overflow area can be any data structure, it is typically implemented as a linked list in dynamic memory. Figure 12 shows the linked list from Figure 11 with three synonyms for address 001 and three synonyms for address 007.
The linked list data can be stored in any order, but a last in-first out (LIFO) stack sequence and a key sequence are the most common. The LIFO sequence is the fastest for inserts because the linked list does not have to be scanned to insert the data. The element being inserted into overflow is simply placed at the beginning of the linked list and linked to the node in the prime area. Key sequenced lists, with the key in the prime area being the smallest, provide for faster search retrieval. Which sequence (LIFO or key-sequence) is used depends on the application.
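A minimal C sketch of separate chaining follows. For brevity it keeps every synonym, including the first, in the chain for its home address rather than splitting a prime area and an overflow area as described above; new nodes are inserted at the head of the list, giving the LIFO order mentioned in the text. The names and the simplification are assumptions for illustration.

#include <stdlib.h>

#define LIST_SIZE 307

struct chainNode
{
    long key;
    struct chainNode *next;
};

struct chainNode *table[LIST_SIZE];        /* one chain head per home address */

void chainInsert(long key)
{
    int address = (int)(key % LIST_SIZE);  /* modulo-division hashing         */
    struct chainNode *n = (struct chainNode *) malloc(sizeof(struct chainNode));
    n->key  = key;
    n->next = table[address];              /* LIFO: new node goes first       */
    table[address] = n;
}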
Key Offset
Key offset is a double hashing method that produces different collision paths for different keys. Whereas the pseudorandom number generator produces a new address as a function of the previous address, key offset calculates the new address as a function of the old address and the key. One of the simplest versions simply adds the quotient of the key divided by the list size to the address to determine the next collision resolution address, as shown in the formula below.
Offset = key/listSize
address = ((offset + old address) modulo listSize)
For example, when the key is 166703 and the list size is 307, using the modulo-division hashing method we generate an address of 1. As shown in Figure 11, this synonym of 070919 produces a collision at address 1. Using key offset to calculate the next address, we get 237, as shown below.
Offset = 166703/307 = 543
address = ((543 + 001) modulo 307)= 237
If 237 were also a collision, we would repeat the process to locate the next address, as shown below.
Offset = 166703/307 = 543
address = ((543 + 237) modulo 307)= 166
To really see the effect of key offset, we need to calculate several different keys, all hashing to the same home address. In Table 2, we calculate the next two collision probe addresses for three keys that collide at address 001.
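The key-offset calculation above can be sketched in C as follows; the function name is an assumption, and the numbers in the comments repeat the example with key 166703 and a list size of 307.

int keyOffsetNext(long key, int oldAddress, int listSize)
{
    long offset = key / listSize;                   /* 166703 / 307 = 543 */
    return (int)((offset + oldAddress) % listSize); /* new probe address  */
}

/* keyOffsetNext(166703,   1, 307) returns 237
   keyOffsetNext(166703, 237, 307) returns 166 */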
Pseudorandom (MAD) Collision Resolution
This method uses a pseudorandom number to resolve the collision. We saw the pseudorandom number generator as a hashing method in “Pseudorandom Method”. We now use it as a collision resolution method. In this case, rather than using the key as a factor in the random number calculation, we use the collision address. Consider the collision we created in Figure 10. We now resolve the collision using the following pseudorandom number generator, where a is 3 and c is 5 :
y = (ax + c) modulo listSize
= (3 x 1 + 5) Modulo 307
= 8, next addresses are 29, 92, and so on
In this example, we resolve the collision by placing the new data in element 008 (Figure 11). Pseudorandom numbers are a relatively simple solution, but they have one significant limitation: all keys follow only one collision resolution path through the list. (This deficiency also occurs in the linear and quadratic probes.) Because pseudorandom collision resolution can create significant secondary clustering, methods such as key offset, which give different keys different collision paths, are often preferred.
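A C sketch of this pseudorandom step, with a = 3 and c = 5 as in the example, might look like the following; the function name is an assumption.

int pseudorandomNext(int oldAddress, int listSize)
{
    const int a = 3, c = 5;
    return (a * oldAddress + c) % listSize;  /* y = (ax + c) modulo listSize */
}

/* Starting from the collision at address 1 with a list size of 307:
   1 -> 8 -> 29 -> 92 -> ... (every key follows this same path) */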
Double Hashing or Rehashing
These are methods of dealing with hash collisions. Double hashing or rehashing involves using a secondary hash function on the hashed key of the item. The rehash function is applied successively until an empty position is found. If the hash position is found occupied during a search, the rehash function is again used to locate the item. In general, a rehash function, RH, accepts an array address and produces another. If array address H (K) is already occupied, a rehash function, RH, is applied to the value of H (K) i.e.
RH (H (K))
to find another location. If position RH (H (K) ) is also occupied, it too is rehashed to see if RH (RH (H (K))) is available. This process continues until an empty location is found. RH is not necessarily the same as the original hash function.
The following two methods are collectively known as double hashing or rehashing methods. In each method, rather than using an arithmetic probe function, the address is rehashed. Both methods prevent primary clustering.
Quadratic Probe
Primary clustering can be eliminated by adding a value other than 1 to the current address. One easily implemented method is to use the quadratic probe. In the quadratic probe, the increment is the collision probe number squared. Thus for the first probe we add 12, for the second collision probe we add 22, for the third collision probe we add 32, and so forth until we either find an empty element or we exhaust the possible elements. To ensure that we don’t run off the end of the address list, we use the modulo of the quadratic sum for the new address. This sequence is shown in Table 1, which for simplicity assumes a collision at location 1 and a list size of 100. From table, we can see quadratic probing causes secondary clustering on a collision resolution path.
address = (H(K) + C i²) modulo listSize
where we take C = 1 and i is the collision probe number.
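A minimal C sketch of the quadratic probe, assuming (as in Table 1) a collision at location 1 and a list size of 100:

#include <stdio.h>

/* quadratic probe: the i-th probe adds i*i to the home address,
   wrapped with modulo so the address stays inside the list */
int quadratic_probe(int homeAddress, int i, int listSize)
{
    return (homeAddress + i * i) % listSize;
}

int main(void)
{
    for (int i = 1; i <= 4; i++)
        printf("%d\n", quadratic_probe(1, i, 100));   /* prints 2, 5, 10, 17 */
    return 0;
}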
Linear Probe
Our first collision resolution method is also the simplest. In a linear probe, when data cannot be stored in the home address, we resolve the collision by adding 1 to the current address. For example, let’s add two more elements to the modulo-division method example. The results are shown in figure 10. When we insert key 070919, we find an empty element and insert it with no collision. When we try to insert key 166703, however, we have a collision at location 001. We try to resolve the collision by adding 1 to the address and inserting the new data at location 002. However, this address is also filled. We therefore add another 1 to the address and this time find an empty location, 003, where we can place the new data.
As an alternative to a simple linear probe, we can add 1, subtract 2, add 3, subtract 4, and so forth until we locate an empty element. In either method, the code for the linear probe must ensure that the next collision resolution address lies within the boundaries of the list.
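A minimal C sketch of the simple linear probe; the modulo keeps the next address within the boundaries of the list (307 is the list size used in the running example):

#include <stdio.h>

/* linear probe: resolve a collision by moving to the next address */
int linear_probe(int address, int listSize)
{
    return (address + 1) % listSize;
}

int main(void)
{
    int addr = 1;                          /* collision at address 001 */
    addr = linear_probe(addr, 307);        /* 002 */
    addr = linear_probe(addr, 307);        /* 003 */
    printf("%d\n", addr);
    return 0;
}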
Open Addressing
The first collision resolution approach, open addressing, resolves collisions in the prime area, that is, the area that contains all of the home addresses. This technique is opposed to linked list resolution, in which the collisions are resolved by placing the data in a separate overflow area.
When a collision occurs, the prime area addresses are searched for an open or unoccupied element where the new data can be placed. We discuss four different methods: linear probe, quadratic probe, pseudorandom collision resolution, and key offset. The last two methods, pseudorandom and key offset, are collectively known as double hashing or rehashing methods.
Clustering
As data are added to a list and collisions are resolved, some hashing functions tend to cause data to group within the list. This tendency of data to build up unevenly across a hashed list is known as clustering. Clustering is a concern because it is usually created by collisions. If the table contains a high degree of clustering, then the number of probes to locate an element grows and reduces the processing efficiency of the table.
Primary clustering occurs when data cluster around a home address. A cluster is a sequence of adjacent occupied entries in a hash table; clusters contain no empty entries and consist of contiguous runs of occupied entries. Linear probing, one collision resolution method, is subject to primary clustering: when a number of keys collide at a given location and we use linear probing to resolve the collisions, the colliding keys are inserted into the empty locations immediately adjacent to the collision location. This can cause a puddle of keys to form at the collision location, which is what we call primary clustering.
Therefore, we need to design our hashing functions to minimize clustering. However, note that with the exception of the direct method, we cannot eliminate collisions. If we have a list with 365 addresses, we can expect to get a collision within the first 23 inserts more than 50% of the time.
Our final concept is that the number of elements examined in the search for a place to store the data must be limited. The traditional limit of examining all elements of the list presents three difficulties. First, the search is not sequential, so finding the end of the list doesn’t mean that every element has been tested. Second, examining every element would be excessively time-consuming for an algorithm that has as its goal a search effort of one. Third, some of the collision resolution techniques cannot physically examine all of the elements in a list (see, for example, “Quadratic Probe”).
Generally, a collision limit is placed on hashing algorithms. What happens when the limit is reached depends on the application.
Generally, there are two different approaches to resolving collisions: open addressing and using linked list chaining.
Load Factor
Before we discuss the collision resolution methods, however, we need to cover three more concepts. Because of the nature of hashing algorithms, there must be some empty elements in a list at all times. In fact, we define a full list as a list in which all elements except one contain data. As a rule of thumb, a hashed list should not be allowed to become more than 75% full. This guideline leads us to our first concept, the load factor. The load factor of a hashed list is the number of elements in the list divided by the number of physical elements allocated for the list, expressed as a percentage. Where k represents the number of filled elements in the list and n represents the total number of elements allocated to the list, the formula is
load factor = (k / n) × 100
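As a quick illustration of the formula (the numbers here are assumed, not taken from the text’s figures): a list allocated 307 elements, of which 200 are filled, has
load factor = (200 / 307) × 100 ≈ 65%
which stays under the 75% rule of thumb.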
Collision Resolution Methods
With the exception of the direct method, none of the methods used for hashing are one-to-one mappings. Thus, when we hash a new key to an address, we may create a collision. There are several methods for handling collisions, each of them independent of the hashing algorithm. That is, each hashing method can be used with each of the collision resolution methods. In this section we discuss the collision resolution methods shown in Figure 9.
Pseudorandom Method
In the pseudorandom method, the key is used as the seed in a pseudorandom number generator and the resulting random number is then scaled into the possible address range using modulo division. A common random number generator is shown below.
y = ax + c
This method is also known as MAD, which stands for multiply, add, and divide. To use the pseudorandom number generator as a hashing method, we set x to the key, multiply it by the coefficient a, and then add the constant c. The result is then divided by the list size, with the remainder (see “Modulo-Division Method”) being the hashed address. Let’s demonstrate the concept with an example from Figure 6. To keep the calculation reasonable, we use 17 and 7 for factors a and c, respectively. Also, the list size in the example is the prime number 307.
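A minimal C sketch of this MAD hashing step; the printed value 41 is simply the arithmetic result of (17 × 121267 + 7) mod 307 for the factors chosen above:

#include <stdio.h>

/* pseudorandom (MAD) hashing: multiply the key by a, add c,
   and keep the remainder after dividing by the list size */
int mad_hash(long key, long a, long c, long listSize)
{
    return (int)((a * key + c) % listSize);
}

int main(void)
{
    printf("%d\n", mad_hash(121267, 17, 7, 307));   /* prints 41 */
    return 0;
}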
Rotation Method
Rotation hashing is generally not used by itself but rather is incorporated in combination with other hashing methods. It is most useful when keys are assigned serially, such as we often see in employee numbers and part numbers. A simple hashing algorithm tends to create synonyms when hashing keys are identical except for the last character. Rotating the last character to the front of the key minimizes this effect. For example, consider the case of a six-digit employee number that might be used in a large company.
Examine the rotated key carefully. Because all keys now end in 60010, they would obviously not work well with modulo division. On the other hand, if we used a simple fold shift hash on the original key and a two-digit address, the addresses would be sequential starting with 62. Using a shift hash on the rotated key results in the series of addresses 26, 36, 46, 56, 66, which has the desired effect of spreading the data more evenly across the address space. Rotation is often used in combination with fold hashing.
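As a sketch in C (the sample key 600101 is an assumed serial number of the kind the section describes, one that rotates to 160010):

#include <stdio.h>
#include <string.h>

/* rotation: move the last character of the key to the front, so that
   serially assigned keys no longer differ only in their last character */
void rotate_key(const char *key, char *rotated)
{
    size_t len = strlen(key);
    rotated[0] = key[len - 1];            /* last character moves to the front */
    memcpy(rotated + 1, key, len - 1);    /* the rest shifts right by one      */
    rotated[len] = '\0';
}

int main(void)
{
    char rotated[16];
    rotate_key("600101", rotated);
    printf("%s\n", rotated);              /* prints 160010 */
    return 0;
}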
Folding Methods
Two folding methods are used: fold shift and fold boundary. In fold shift, the key value is divided into parts whose size matches the size of the required address. Then the left and right parts are shifted and added with the middle part. For example, imagine we want to map identity numbers into three-digit addresses. We divide the nine-digit identity number into three three-digit numbers, which are then added. If the resulting sum is greater than 999, we discard the leading digit. This method is shown in Figure 7 (a).
In fold boundary, the left and right numbers are folded on a fixed boundary between them and the center number. The two outside values are thus reversed, as seen in Figure 7 (b). It is interesting to note that the two folding methods give different hashed addresses.
Figure 7: Hash fold examples for key 123456789
(a) Fold shift: 123 + 456 + 789 = 1 368, giving address 368
(b) Fold boundary: 321 + 456 + 987 = 1 764, giving address 764
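A minimal C sketch of the two folding methods for nine-digit keys; taking the sum modulo 1000 is equivalent to discarding the leading digit when the sum exceeds 999:

#include <stdio.h>

/* reverse a three-digit number, e.g. 123 -> 321 */
static long reverse3(long n)
{
    return (n % 10) * 100 + ((n / 10) % 10) * 10 + n / 100;
}

/* fold shift: split the key into three three-digit parts and add them */
long fold_shift(long key)
{
    long right = key % 1000;
    long mid   = (key / 1000) % 1000;
    long left  = key / 1000000;
    return (left + mid + right) % 1000;
}

/* fold boundary: reverse the two outside parts before adding */
long fold_boundary(long key)
{
    long right = key % 1000;
    long mid   = (key / 1000) % 1000;
    long left  = key / 1000000;
    return (reverse3(left) + mid + reverse3(right)) % 1000;
}

int main(void)
{
    printf("%ld\n", fold_shift(123456789L));      /* prints 368 */
    printf("%ld\n", fold_boundary(123456789L));   /* prints 764 */
    return 0;
}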
Midsquare Method
In midsquare hashing, the key is squared and the address is selected from the middle of the squared number. The most obvious limitation of this method is the size of the key. Given a key of 6 digits, the product will be 12 digits, which is beyond the maximum integer size of many computers. Because most personal computers can handle a 9-digit integer, let’s demonstrate the concept with keys of 4 digits. Given a key of 9452, the midsquare address calculation is shown below using a 4-digit address (0000 to 9999).
9452 × 9452 = 89340304 : address is 3403
As a variation on the midsquare method, we can select a portion of the key, such as the middle three digits, and then use it rather than the whole key. Doing so allows the method to be used when the key is too large to square. For example, for the keys in Figure 6, we can select the first three digits and then apply the midsquare method to that prefix. (We select the third, fourth, and fifth digits of the square as the address.)
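A minimal C sketch of both forms, assuming the square of a four-digit key is treated as an eight-digit number and the square of the three-digit prefix as a six-digit number (leading zeros included); the key 379452 is taken from Figure 6:

#include <stdio.h>

/* midsquare: square the key and take the middle four digits */
int midsquare(long key)
{
    long square = key * key;                 /* 9452 * 9452 = 89340304 */
    return (int)((square / 100) % 10000);    /* middle four digits: 3403 */
}

/* variation: square only the first three digits of a six-digit key and
   take the third, fourth, and fifth digits of the six-digit square */
int midsquare_prefix(long key)
{
    long prefix = key / 1000;                /* e.g. 379452 -> 379 */
    long square = prefix * prefix;           /* 379 * 379 = 143641 */
    return (int)((square / 10) % 1000);      /* digits 3-5: 364 */
}

int main(void)
{
    printf("%d\n", midsquare(9452));           /* prints 3403 */
    printf("%d\n", midsquare_prefix(379452));  /* prints 364  */
    return 0;
}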
Digit – Extraction Method
Using digit extraction, selected digits are extracted from the key and used as the address. For example, using our six-digit employee number to hash to a three-digit address (000 to 999) we could select the first, third, and fourth digits (from the left) and use them as the address. Using the keys from Figure 6, we would hash them to the addresses shown below.
379452 → 394
121267 → 112
378845 → 388
160252 → 102
045128 → 051
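A minimal C sketch of this extraction for six-digit keys (digit positions counted from the left, with a leading zero assumed for keys such as 045128):

#include <stdio.h>

/* digit extraction: build a three-digit address from the first,
   third, and fourth digits of a six-digit key */
int digit_extract(long key)
{
    int d1 = (int)(key / 100000);        /* first digit  */
    int d3 = (int)((key / 1000) % 10);   /* third digit  */
    int d4 = (int)((key / 100) % 10);    /* fourth digit */
    return d1 * 100 + d3 * 10 + d4;
}

int main(void)
{
    printf("%03d\n", digit_extract(379452));   /* prints 394 */
    printf("%03d\n", digit_extract(45128));    /* key 045128, prints 051 */
    return 0;
}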
Modulo-Division Method
Also known as division remainder, the modulo-division method divides the key by array size (table size) and uses the remainder for the address. This method gives us the simple hashing algorithm shown below where tableSize is the number of elements in the array.
address = key MODULUS tableSize
This algorithm works with any table size, but a table size that is a prime number produces fewer collisions than other table sizes. We should therefore try, whenever possible, to make the array size a prime number.
As the little company begins to grow, we realize that soon we will have more than 100 employees. Planning for the future, we create a new employee numbering system that will handle 1,000,000 employees. We also decide that we want to provide data space for up to 300 employees. The first prime number greater than 300 is 307. We therefore choose 307 as our list (array) size, which gives us a table with addresses that range from 0 through 306. Our new employee table and some of its hashed addresses are shown in Figure 6.
To demonstrate, let’s hash Bryan Devaux’s employee number, 121267.
121267/307 = 395 with remainder of 2
Therefore: hash(121267) = 2
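A minimal C sketch of modulo-division hashing with the list size 307 used above:

#include <stdio.h>

/* modulo-division: the remainder of key / tableSize is the address */
int modulo_hash(long key, int tableSize)
{
    return (int)(key % tableSize);
}

int main(void)
{
    printf("%d\n", modulo_hash(121267, 307));   /* prints 2 */
    return 0;
}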
Direct Method
In direct hashing, the key is the address without any hashing manipulation. The structure must therefore contain an element for every possible key. The situations in which you can use direct hashing are limited, but it can be very powerful because it guarantees that there are no synonyms. Let’s look at two applications.
Now let’s take an example. Imagine that a small organization has fewer than 100 employees. Each employee is assigned an employee number between 1 and 100. In this case, if we create an array of 101 employee records, the employee number can be directly used as the address of any individual record. This concept is shown in Figure 5.
Note that not every element in the array contains an employee’s record. In fact, all hashing techniques other than direct hashing require that some of the elements be empty to reduce the number of collisions.
Although this is the ideal method, its application is very limited. For example, we cannot use the National Identity Card Number as the key with this method, because the National Identity Card Number is 13 digits long. In other words, if we used the National Identity Card Number as the key, we would need an array of 10,000,000,000,000 records but would use fewer than 100 of them. Let’s turn our attention, then, to hashing techniques that map a large population of possible keys into a small address space.
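A minimal C sketch of direct hashing for the 100-employee example; the record fields and the empty-slot convention are assumptions for the illustration:

#include <stdio.h>
#include <string.h>

#define MAX_EMPLOYEES 100

struct Employee {
    int  number;                 /* 0 marks an empty slot */
    char name[32];
};

/* one extra element so employee number n can be used directly as index n */
struct Employee table[MAX_EMPLOYEES + 1];

void insert(int number, const char *name)
{
    table[number].number = number;       /* direct hashing: the key is the address */
    strncpy(table[number].name, name, sizeof table[number].name - 1);
}

struct Employee *lookup(int number)
{
    return table[number].number ? &table[number] : NULL;   /* no synonyms possible */
}

int main(void)
{
    insert(62, "Employee 62");           /* illustrative entry */
    printf("%s\n", lookup(62) ? lookup(62)->name : "not found");
    return 0;
}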
Probing and Probing Sequence
It should be obvious that when we need to locate an element in a hashed list, we must use the same algorithm that we used to insert it into the list. Consequently, we first hash the key and check the home address to determine whether it contains the desired element. If it does, the search is complete. If not, we must use the collision resolution algorithm to determine the next location and continue until we find the element or determine that it is not in the list. Each calculation of an address and test for success is known as a probe. In the case of collisions, the open addressing method produces alternate lists of addresses. The process of probing produces a probing sequence, which is the sequence of locations we examine when we attempt to insert a new key into the hashed table, T. The first location in the probe sequence is the hash address, H(K). The second and successive locations in the probe sequence are determined by the collision resolution policy. To guarantee that there is always an empty location in every probe sequence, we define a “full” table T to be a table having exactly one empty table entry.
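A generic search loop in C that follows the same probe sequence used at insertion; the hash and probe functions here are placeholders (any of the methods described in this text could be plugged in), and the probe limit is an assumed collision limit:

#include <stdio.h>

#define LIST_SIZE 307
#define EMPTY (-1L)
#define PROBE_LIMIT 16

int hash_key(long key)      { return (int)(key % LIST_SIZE); }    /* H(K) */
int next_probe(int address) { return (address + 1) % LIST_SIZE; } /* e.g. a linear probe */

/* return the index holding key, or -1 if it is not in the list */
int hashed_search(const long table[], long key)
{
    int addr = hash_key(key);                     /* first probe: the home address */
    for (int i = 1; i <= PROBE_LIMIT; i++) {
        if (table[addr] == EMPTY) return -1;      /* key cannot be in the list     */
        if (table[addr] == key)   return addr;    /* found on this probe           */
        addr = next_probe(addr);                  /* follow the probe sequence     */
    }
    return -1;                                    /* collision limit reached       */
}

int main(void)
{
    long table[LIST_SIZE];
    for (int i = 0; i < LIST_SIZE; i++) table[i] = EMPTY;
    table[hash_key(121267)] = 121267;             /* stored at its home address, 2 */
    printf("%d\n", hashed_search(table, 121267)); /* prints 2 */
    return 0;
}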
Collisions in Hashing
Generally, the population of keys for a hashed list is greater than the storage area for the data. For example, if we have an array of 50 students for a class in which the students are identified by the last four digits of their National Identity Card Numbers, then there are 200 possible keys for each element in the array (10,000/50).
Because there are many keys for each index location in the array, more than one student may hash to the same location in the array. We call the set of keys that hash to the same location in our list synonyms.
Hash functions H(K) are used to map keys into table addresses in order to store data as table entries (K, I) in a hashed table. Almost always we use hashing techniques in which there are many more distinct keys K than there are table addresses, so we encounter situations in which two distinct keys, K1 and K2, map to the same table address, i.e.,
H(K1) = H(K2)
Hash Function
The idea of using the key, drawn from a large set K of keys, to determine the address of a record within a smaller set L of table addresses leads to the form of a hash function H. A function that performs this job, such as
H(K) = L
is called a hash function or hashing function. Another way to describe hashing is as a key-to-address transformation in which the keys map to addresses in a list. This mapping transformation is shown in Figure 2. At the top of Figure 2 is a general representation of the hashing concept. The rest of the figure shows how three keys might hash to three different addresses in the list. There are various methods or functions used to map keys to addresses, namely the modulo-division, midsquare, and folding methods, among others.
There are a few criteria and purposes for hashing keys. One is that the hashing function should be easy, simple, and quick to compute. The second is to map a large set of possible keys to a smaller set of table addresses; usually the population of possible keys is much larger than the allocated table size. Third, we would like the hashed addresses to be distributed evenly over the smaller set so that there is a minimum number of collisions; collision elimination is not guaranteed, and an uneven distribution causes clustering, which is not desirable. Fourth, hashing allows direct access to the specific location in the table where the key of the record has been hashed; its running-time efficiency is O(1).
A hash table is a list of records in which keys are mapped by a hash function. It is similar to the table, T, described in section 1.0, except that here the table entries T(K, I) are generated by an appropriate hash function. A hash table generated by a perfect hash has no collisions; in this case, usually all possible keys must be known beforehand. This is also known as optimal hashing or perfect hashing.
Table
A table, T, is an abstract data storage structure that contains table entries that are either empty or are pairs of the form (K, I), where K is a key and I is some data or information associated with the key K.
In a table, there is no order on the elements. A table with no order, showing a list of students with identification number, name, and phone, is given below.
Figure 1: A Simple Table
The identity number is the “key”, and the name, phone, etc. are the “information” associated with the key.
A common table operation is table searching, which is an activity in which, given a search key, K, we attempt to find the table entry (K, I) in T containing the key K. Then we may wish to retrieve or update its information, I, or we may wish to delete the entire table entry (K, I). We may also wish to insert a new table entry (K, I). We can enumerate entries in table T, e.g. to print contents of the table.
Table can be represented by various data structures like C struct, arrays and linked lists.
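As a sketch of one such representation in C (the field names and sizes are illustrative only): a table entry (K, I) can be a struct pairing the key with its information, and the table an array of such structs.

#include <stdio.h>

struct Entry {
    long key;              /* K: e.g. an identification number        */
    char name[32];         /* I: information associated with the key */
    char phone[16];
};

#define TABLE_SIZE 100
struct Entry T[TABLE_SIZE];

/* sequential table search: return the entry containing key K, or NULL */
struct Entry *table_search(long K, int count)
{
    for (int i = 0; i < count; i++)
        if (T[i].key == K)
            return &T[i];
    return NULL;
}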
Thursday, October 22, 2009
Binary Trees
A binary tree T is defined as a finite set of elements, called nodes, such that:
(a) T is empty (called the NULL tree or empty tree) or
(b) T contains a distinguished node R, called the root of T, and the remaining nodes of T form an ordered pair of disjoint binary trees T1 and T2.
A binary tree is a tree in which no node can have more than two subtrees. In other words, a node can have zero, one, or two subtrees. These subtrees are designated as the left subtree and right subtree.
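A common C representation of such a node, as a sketch (an empty pointer stands for the empty tree):

#include <stdlib.h>

struct Node {
    int data;                 /* the information stored in the node */
    struct Node *left;        /* left subtree, or NULL if empty     */
    struct Node *right;       /* right subtree, or NULL if empty    */
};

/* create a leaf node: both subtrees are empty */
struct Node *make_node(int data)
{
    struct Node *n = malloc(sizeof *n);
    if (n) {
        n->data = data;
        n->left = n->right = NULL;
    }
    return n;
}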
Terminology of Trees
In addition to the root, many different terms are used to describe the attributes of a tree. A leaf is any node with an outdegree of zero. A node that is not a root or a leaf is known as an internal node because it is found in the middle portion of a tree. A leaf, being a node with no successor, is also called a terminal node.
A node is a parent (predecessor) if it has successor nodes – that is, if it has an outdegree greater than zero. Conversely, a node with a predecessor is a child (Successor). A child node has an indegree of one. Two or more nodes with the same parent are siblings. An ancestor is any node in the path from the root to the node. A descendent is any node in the path below the parent node; that is, all nodes in the paths from a given node to a leaf are descendents of the node.
The level of a node is its distance from the root. Because the root has a zero distance from itself, the root is at level 0. The children of the root are at level 1, their children are level 2, and so forth.
Basic Tree Concepts
A tree consists of a finite set of elements, called nodes, and a finite set of directed lines, called branches, that connect the nodes. The number of branches associated with a node is the degree of the node. When the branch is directed toward the node, it is an indegree branch; when the branch is directed away from the node, it is an outdegree branch. The sum of the indegree and outdegree branches is the degree of the node.
If the tree is not empty, then the first node is called the root. The indegree of the root is, by definition, zero. With the exception of the root, all of the nodes in a tree must have an indegree of exactly one. All nodes in the tree can have zero, one, or more branches leaving them; that is, they may have an outdegree of zero, one, or more.
Tuesday, September 8, 2009
The Beginnings of The Internet
It will help in discussing the beginnings of the Internet to define what the Internet is. Now you can get as many different definitions of what the Internet is as you can dictionaries. But for most of us, the simple description, a "worldwide system of interconnected networks and computers", is pretty good and adequate. But when people get more technical, they tend to add to the definition terms such as "a network that uses the Transmission Control Protocol - Internet Protocol" (or TCP/IP).
Many people have heard that the Internet began with some military computers in the Pentagon called Arpanet in 1969. The theory goes on to suggest that the network was designed to survive a nuclear attack. However, whichever definition of what the Internet is we use, neither the Pentagon nor 1969 holds up as the time and place the Internet was invented. A project which began in the Pentagon that year, called Arpanet, gave birth to the Internet protocols sometime later (during the 1970s), but 1969 was not the Internet's beginning. Surviving a nuclear attack was not Arpanet's motivation, nor was building a global communications network. Bob Taylor, the Pentagon official who was in charge of the Pentagon's Advanced Research Projects Agency Network (or Arpanet) program, insists that the purpose was not military, but scientific. The nuclear attack theory was never part of the design. Nor was an Internet in the sense we know it part of the Pentagon's 1969 thinking. Larry Roberts, who was employed by Bob Taylor to build the Arpanet network, states that Arpanet was never intended to link people or be a communications and information facility. Arpanet was about time-sharing. Time-sharing tried to make it possible for research institutions to use the processing power of other institutions' computers when they had large calculations to do that required more power, or when someone else's facility might do the job better.
What Arpanet did in 1969 that was important was to develop a variation of a technique called packet switching. In 1965, before Arpanet came into existence, an Englishman called Donald Davies had proposed a similar facility to Arpanet in the United Kingdom, the NPL Data Communications Network. It never got funded, but Donald Davies did develop the concept of packet switching, a means by which messages can travel from point to point across a network. Although others in the USA were working on packet switching techniques at the same time (notably Leonard Kleinrock and Paul Baran), it was the UK version that Arpanet first adopted.
However, although Arpanet developed packet switching, Larry Roberts makes it clear that sending messages between people was "not an important motivation for a network of scientific computers". Its purpose was to allow people in diverse locations to utilise time on other computers. It never really worked as an idea - for a start, all the computers had different operating systems and versions and programs, and using someone else's machine was very difficult; but as well, by the time some of these problems were being overcome, mini-computers had appeared on the scene and the economics of time-sharing had changed dramatically. So it's reasonable to say that Arpanet failed in its purpose, but in the process it made some significant discoveries that were to result in the creation of the first Internet.
These included email developments, packet switching implementations, and development of the Transmission Control Protocol - Internet Protocol, or TCP/IP. TCP/IP is the backbone protocol which technical people claim is the basis for determining what the Internet is. It was developed in the 1970s in California by Vinton Cerf, Bob Kahn, Bob Braden, Jon Postel and other members of the Networking Group headed by Steve Crocker. TCP/IP was developed to solve problems with earlier attempts at communication between computers undertaken by Arpanet.
Vinton Cerf had worked on the earlier Arpanet protocols while at the University of California in Los Angeles from 1968 to 1972. He moved to Stanford University in late 1972. At the same time Bob Kahn, who had been the chief architect of the Arpanet while working for the contracting firm Bolt Beranek and Newman, left that firm and joined ARPA.
In October 1972 Arpanet was publicly demonstrated for the first time at the International Computer Communications Conference in Washington DC. Following that meeting, an International Networking Group chaired by Vinton Cerf was established. Bob Kahn visited Stanford in the spring of 1973, and he and Vint Cerf discussed the problem of interconnecting multiple packet networks that were NOT identical. They developed the basic concepts of TCP at that time and presented them to the newly established International Networking Group. This meeting and this development really rate as the beginning of the Internet. Nobody knows who first used the word Internet - it just became a shortcut around this time for "internetworking". The earliest written use of the word appears to be by Vint Cerf in 1974.
By 1975 the first prototype was being tested. A few more years were spent on technical development, and in 1978 TCP/IPv4 was released. It would be some time before it became available to the rest of us. In fact, TCP/IP was not even added to Arpanet officially until 1983. So we can see that the Internet began as an unanticipated result of an unsuccessful military and academic research program component, and was more a product of the US west coast culture of the 1980s than a product of the post-war Pentagon era.
History of the World Wide Web
Before the World Wide Web the Internet really only provided screens full of text (and usually only in one font and font size). So although it was pretty good for exchanging information, and indeed for accessing information such as the Catalogue of the US Library of Congress, it was visually very boring.
In an attempt to make this more aesthetic, companies like Compuserve and AOL began developing what used to be called GUIs (or graphical user interfaces). GUIs added a bit of colour and a bit of layout, but were still pretty boring. Indeed IBM personal computers were only beginning to adopt Windows interfaces - before that, with MSDOS interfaces, they were pretty primitive. So the Internet might have been useful, but it wasn't good looking. Probably the World Wide Web saved the net. Not only did it change its appearance, it made it possible for pictures and sound to be displayed and exchanged.
The web had some important predecessors, perhaps the most significant of these being Ted Nelson's Xanadu project, which worked on the concept of hypertext - where you could click on a word and it would take you somewhere else. Ted Nelson envisaged with Xanadu a huge library of all the world's information. In order to click on hyperlinks, as they were called, Douglas Engelbart invented the mouse, which was later to become a very important part of personal computers. So the idea of clicking on a word or a picture to take you somewhere else was a basic foundation of the web.
Another important building block was the URL, or Uniform Resource Locator. This allowed you a further option to find your way around by naming a site. Every site on the world wide web has a unique URL. The other feature was Hypertext Markup Language (HTML), the language that allowed pages to display different fonts and sizes, pictures, colours etc. Before HTML, there was no such standard, and the GUIs we talked about before only belonged to different computers or different computer software. They could not be networked.
It was Tim Berners-Lee who brought this all together and created the World Wide Web. The first trials of the World Wide Web were at the CERN laboratories in Switzerland in December 1990. By 1991 browser and web server software was available, and by 1992 a few preliminary sites existed in places like the University of Illinois, where Marc Andreessen became involved. By the end of 1992, there were about 26 sites.
The first browser which became popularly available to take advantage of this was Mosaic, in 1993. Mosaic was as slow as a wet week, and really didn't handle downloading pictures well at all - so the early world wide web experience with Mosaic, and with domestic modems that operated at one sixth of current modem speeds at best, was pretty lousy and really didn't give much indication of the potential of this medium. On April 30, 1993 CERN's directors made a statement that was a true milestone in Internet history. On this day, they declared that WWW technology would be freely usable by anyone, with no fees being payable to CERN. This decision - much in line with the decisions of the earlier Internet pioneers to make their products freely available - was a visionary and important one. The browser really did begin to change everything. By the end of 1994 there were a million browser copies in use - rapid growth indeed! Then we really started to see growth. Every year from 1994 to 2000, the Internet saw massive growth, the like of which had not been seen with any preceding technology.
The Internet era had begun. The first search engines began to appear in the mid 1990s, and it didn't take long for Google to come on the scene and establish a dominant market position.
In the early days, the web was mainly used for displaying information. On line shopping, and on line purchase of goods, came a little bit later. The first large commercial site was Amazon, a company which in its initial days concentrated solely on book markets. The Amazon concept was developed in 1994, a year in which some people claim the world wide web grew by an astonishing 2300 percent! Amazon saw that on line shopping was the way of the future, and chose the book market as a field where much could be achieved.
By 1998 there were 750,000 commercial sites on the world wide web, and we were beginning to see how the Internet would bring about significant changes to existing industries. In travel, for instance, we were able to compare different airlines and hotels and get the cheapest fares and accommodation - something pretty difficult for individuals to do before the world wide web. Hotels began offering last minute rates through specially constructed websites, thus furthering the power of the web as a sales medium.
Internet revolution
It all started in October 1969. Scientists at the University of California, Los Angeles, were ready for a critical experiment. They had a computer and communications node, while colleagues installed similar equipment up the coast in Menlo Park. They planned to test whether they could link the two computers over telephone lines to operate as one system. The researchers began to tap in the message: 'log in' to make the link. The system crashed.
The commercial importance of this breakthrough was not fulfilled for another 25 years - just as the invention of the steam engine by James Watt in the 1780s did not become really useful and developed until the launch of the rail engine two decades later. Similarly, the petrol combustion engine did not lead to cars and trucks for about two decades.
The significance of the internet is that it takes the computer and 'information technology' onto a new stage: computers now communicate with each other. That is producing a dramatic exponential increase in the speed of transmitting information. Computers and the microchip were for the era of the 1970s and 1980s. The 1990s and the first decade of the new Christian-based millennium are for telecommunications and the internet. The internet will expand across the globe just as the railroad did in the latter half of the 19th century and the motor car and airplane did in the latter half of the 20th century. The economic result will be a huge reduction in the time taken to transmit information and, with it, a fall in the costs of producing goods and services.
Internet commerce
By 2003 there will be 500m people connected to the internet, or about 10% of the world's population. By 2003, over 65% of US households will be connected. In the same way that the railroad, motor car, electricity connections and the airplane developed huge new industries and capitalist conglomerates, new internet companies are springing up as fast as you can say .com. By 2004, it is estimated that worldwide business-to-business internet commerce will reach over $7trn. Internet commerce in the US will reach $2.5trn, or about 25% of the annual output of the US in that year. Already the internet companies have grown bigger in size than the former technology giants (airlines, publishing and healthcare) and are catching up with the automobile industry quickly.
The impact of the internet revolution has already been felt on economic growth. The information technology sectors as a whole (computers, telecomm, internet, software etc) in the US are growing at double the rate of the rest of the US economy. They now contribute over 8% of US annual output on their own. Indeed, since 1993, without the IT sectors, US economic growth would have been 1% of GDP lower each year. In 1999, nearly 80% of all investment by capitalist companies in the US went into information technology sectors! Over one million jobs have been created by the US high-tech sector since 1993 and there are now 8m people working there at generally 50% higher average wages.
Salvation from the crisis?
Now in the decade of the 2000s it will be the internet. The technological marvel of the internet will not save capitalism from crisis, just as the railroad did not in the 1880s and the automobile did not in the 1930s. Indeed, for some very good reasons, it will exacerbate the inevitable slump in capitalist prosperity. The first reason is that, just as the railroad and automobile did before it, the internet is drastically lowering costs. But this is a huge deflationary force on capitalist companies' ability to keep up prices. Intense competition and huge investment of capital are boosting economic growth now, but they are doing so at the expense of capitalist profitability.
Internet companies do not make any profit. They remain a huge cost to the rest of the economy. But investment in the new technology has become a necessity to compete. This necessity has leapt well beyond the ability to garner surplus value from the investment. Just the top nine Internet companies are worth $100bn on the US stock market. But they make sales of just $1bn, or 1%. And that is just sales. They make no profit. Compare that even to the ten leading technology companies like Microsoft. They are worth only $50bn, but they make $100bn in annual revenues (and some of these make a profit too!). Overinvestment and overcapacity will be the outcome of the internet boom.
The internet and IT revolution is a huge deflationary force on the capitalist economy. That is the result of a system that develops technology through competition and private capitalist investment. Intense competition means that the profitable advantage gained by the first company to use the new technology very quickly disappears. The eventual outcome is that everybody uses the new technology and nobody gains extra profit as a result. Investment eventually shoots up much faster than the extra productivity of labour created.
Just a matter of time
It is only a matter of time before the US internet bubble is burst, investments collapse and consumption of the masses falls back because of a loss of confidence in the 'new economy'. The internet revolution is a great technical leap forward. But under capitalism, it is being exploited by more and more precious investment capital being thrown into this tiny sector of the economy at the expense of all the rest. That happens under capitalism because there is no planning and no direction of resources. 'Market forces' mean speculative investment, intense competition between capitalist investors, and above all, huge over-investment in relation to profitability.
The canal share boom of 1835-36 was followed by slump and falling prices. The railway stock mania of 1869-73 was followed by the biggest depression then seen under capitalism. The same was seen in the aftermath of the share boom of the 1920s. Japan's stock market bubble of the 1980s has been followed by ten years of stagnation and recession. The optimists of capitalism believe that the internet revolution is really a low-price, low-cost boom that will last decades. The reality is that it is just another speculative financial market bubble that will turn into a deflationary bust. As I write, just about everybody in the capitalist world, including former sceptics of internet stocks, now believes that internet companies will continue to drive upwards for the foreseeable future. When everybody agrees, you know it won't last much longer.
Revolution of www
The advent of the World Wide Web fits every definition of a revolution. Within a few brief years, the Web has grown from yet another obscure Internet protocol into the world's most widely accepted source of information. Few today question or doubt the efficacy of the Web. For the delivery of digitized material in every conceivable medium and mode, the Web looms as the cornerstone of the information age.
The World Wide Web has changed the way educators think about computers. Across Michigan and across the nation, school districts — from the most financially strapped to the most affluent — are allocating funds and resources to build on-ramps to the information highway. Districts that formerly defeated millages now routinely vote specifically in support of technology for schools. Statistically, the odds are your school is now wired, or will be soon.
As music educators, it behooves us all to become Web-aware — for our students, our programs, our administrators, and our profession. Music resources that were beyond conception a few years ago now await our collective mouse-clicks. To take advantage of the Web, and to make our own contributions, each of us needs to become acquainted with the World Wide Web.
The Web is part of the Internet, a network of networks. Born in the 1950s, when the domain of computer users was confined to research institutions and the military, the Internet was a set of hardware and software standards for exchanging information among computer networks worldwide. A variety of protocols, for electronic mail, file transfer, etc., determined how information was transmitted and received, as long as the information was unadorned text. The Web updated the Internet for the 1990s, a world of desktop computers bearing multimedia capabilities and graphical user interfaces.
Many of the concepts of the Web can be traced to the work of Ted Nelson, the visionary and enigmatic technologist. In the 1970s Nelson began evangelizing his vision of a world in which information would be digital, ubiquitous, and interconnected. In his view, information would be linked in ways more logical than linear. For these links he coined the term "hypertext," and for the world of linked information, "Xanadu." Although careers, companies, and millions of dollars were lost in the failed effort to realize Xanadu, the concept of hypertext took on a life of its own. Computer programmers attempted to create their own realization of hypertext, most notably Bill Atkinson and his HyperCard software for the Macintosh.
Tim Berners-Lee, the father of the Web, claims he arrived at the concept of hypertext independently. Nonetheless, even Nelson has endorsed his implementation. In the early 1990s, Berners-Lee was employed as computer consultant for the Center for Particle Physics Research (CERN) in Geneva. Seeking to provide improved Internet services for the physicists at CERN, he submitted a detailed proposal in 1991 that described the major components of what we now know as the Web:
Hypertext - the means of linking a portion of text to another location within the document, to another document on the computer's hard drive, or to another document anywhere on the Internet;
Hypertext Markup Language (HTML) - a means of encoding text with embedded tags, supporting formatted text, hypertext, and additional media;
Hypertext Transfer Protocol (HTTP) - the means of transmitting and receiving HTML documents on the Internet;
Browser - "client" software that allows users to receive and view HTML documents on their computers.
In Berners-Lee's concept, HTML was to be platform-neutral; in fact, the first Web browser was created for the NeXT computer. In the true spirit of the Internet, CERN made the Web specifications public, allowing anyone to create a browser for interpreting HTML files.
Marc Andreessen did just that. A graduate student at the University of Illinois, Andreessen developed a Web browser called Mosaic for Macintosh and Windows. When the university made the software available free to the public in 1993, the unstoppable revolution had begun. More than ten thousand copies of Mosaic were being downloaded daily from the Illinois site. Users became enthralled with the nascent potential of the Web, and began creating their own Web sites. As the content of the Web grew, it attracted more users. By any measure, the growth was explosive.
History of the World Wide Web
Before the World Wide Web the Internet really only provided screens full of text (and usually only in one font and font size). So although it was pretty good for exchanging information, and indeed for accessing information such as the Catalogue of the US Library of Congress, it was visually very boring.In an attempt to make this more aesthetic, companies like Compuserve and AOL began developing what used to be called GUIs (or graphical user interfaces). GUIs added a bit of colour and a bit of layout, but were still pretty boring. Indeed IBM personal computers were only beginning to adopt Windows interfaces - before that with MSDOS interfaces they were pretty primitive. So the Internet might have been useful, but it wasn't good looking.Probably the World Wide Web saved the net. Not only did it change its appearance, it made it possible for pictures and sound to be displayed and exchanged.The web had some important predecessors, perhaps the most significant of these being Ted Nelson's Xanadu project, which worked on the concept of Hypertext - where you could click on a word and it would take you somewhere else. Ted Nelson envisaged with Xanadu a huge library of all the worlds' information. In order to click on hyperlinks, as they were called, Douglas Engelbart invented the mouse, which was to later become a very important part of personal computers. So the idea of clicking on a word or a picture to take you somewhere else was a basic foundation of the web.Another important building block was the URL or Uniform Resource Locator. This allowed you a further option to find your way around by naming a site. Every site on the worldwide web has a unique URL The other feature was Hypertext Markup Language (html), the language that allowed pages to display different fonts and sizes, pictures, colours etc. Before HTML, there was no such standard, and the "GUIs we talked about before only belonged to different computers or different computer software. They could not be networked.It was Tim Berners Lee who brought this all together and created the World Wide Web. The first trials of the World Wide Web were at the CERN laboratories of in Switzerland in December 1990. By 1991 browser and web server software was available, and by 1992 a few preliminary sites existed in places like University of Illinois, where Mark Andreesen became involved. By the end of 1992, there were about 26 sites.The first browser which became popularly available to take advantage of this was Mosaic, in 1993. Mosaic was as slow as a wet week, and really didn't handle downloading pictures well at all - so the early world wide web experience with Mosaic, and with domestic modems that operated at one sixths of current modem speeds at best, were pretty lousy and really didn't give much indication of the potential of this medium. On April 30, 1993 CERN's directors made a statement that was a true milestone in Internet history. On this day, they declared that WWW technology would be freely usable by anyone, with no fees being payable to CERN. This decision - much in line with the decisions of the earlier Internet pioneers to make their products freely available - was a visionary and important one. The browser really did begin to change everything. By the end of 1994 there were a million browser copies in use - rapid growth indeed!! Then we really started to see growth. Every year from 1994 to 2000, the Internet saw massive growth, the like of which had not been seen with any preceding technology. 
The Internet era had begun.The first search engines began to appear in the mid 1990s, and it didn't take long for Google to come on the scene, and establish a dominant market position.In the early days, the web was mainly used for displaying information. On line shopping, and on line purchase of goods, came a little bit later. The first large commercial site was Amazon, a company which in its initial days concentrated solely on book markets. The Amazon concept was developed in 1994, a year in which some people claim the world wide web grew by an astonishing 2300 percent! Amazon saw that on line shopping was the way of the future, and chose the book market as a field where much could be achieved.By 1998 there were 750,000 commercial sites on the world wide web, and we were beginning to see how the Internet would bring about significant changes to existing industries. In travel for instance, we were able to compare different airlines and hotels and get the cheapest fares and accommodation - something pretty difficult for individuals to do before the world wide web. Hotels began offering last minute rates through specially constructed websites, thus furthering the power of the web as a sales medium.
Internet revolution
It all started in October 1969. Scientists at the University of California, Los Angeles, were ready for a critical experiment. They had a computer and communications node, while colleagues installed similar equipment up the coast in Menlo Park. They planned to test whether they could link the two computers over telephone lines to operate as one system. The researchers began to tap in the message: 'log in' to make the link. The system crashed.
The commercial importance of this breakthrough was not realized for another 25 years, just as the invention of the steam engine by James Watt in the 1780s did not become really useful until the development of the railway locomotive two decades later. Similarly, the petrol combustion engine did not lead to cars and trucks for about two decades.
The significance of the internet is that it takes the computer and 'information technology' onto a new stage: computers now communicate with each other. That is producing a dramatic, exponential increase in the speed of transmitting information. Computers and the microchip defined the era of the 1970s and 1980s; the 1990s and the first decade of the new millennium belong to telecommunications and the internet. The internet will expand across the globe just as the railroad did in the latter half of the 19th century and the motor car and airplane did in the latter half of the 20th century. The economic result will be a huge reduction in the time taken to transmit information and, with it, a fall in the costs of producing goods and services.
Internet commerce
By 2003 there will be 500m people connected to the internet, or about 10% of the world's population, and over 65% of US households will be connected. In the same way that the railroad, the motor car, electricity connections and the airplane developed huge new industries and capitalist conglomerates, new internet companies are springing up as fast as you can say .com. By 2004, it is estimated that worldwide business-to-business internet commerce will reach over $7trn. Internet commerce in the US will reach $2.5trn, or about 25% of the annual output of the US in that year. Already the internet companies have grown bigger in size than the former technology giants (airlines, publishing and healthcare) and are quickly catching up with the automobile industry.
The impact of the internet revolution has already been felt on economic growth. The information technology sectors as a whole (computers, telecoms, internet, software and so on) in the US are growing at double the rate of the rest of the US economy. They now contribute over 8% of US annual output on their own. Indeed, since 1993, without the IT sectors, US economic growth would have been 1% of GDP lower each year. In 1999, nearly 80% of all investment by capitalist companies in the US went into information technology! Over one million jobs have been created by the US high-tech sector since 1993, and there are now 8m people working there at wages generally 50% higher than average.
Salvation from the crisis?
Now, in the decade of the 2000s, it will be the internet. The technological marvel of the internet will not save capitalism from crisis, just as the railroad did not in the 1880s and the automobile did not in the 1930s. Indeed, for some very good reasons, it will exacerbate the inevitable slump in capitalist prosperity. The first reason is that, just like the railroad and the automobile before it, the internet is drastically lowering costs. That is a huge deflationary force on capitalist companies' ability to keep up prices. Intense competition and huge investment of capital are boosting economic growth now, but they are doing so at the expense of capitalist profitability.
Internet companies do not make any profit. They remain a huge cost to the rest of the economy. But investment in the new technology has become a necessity to compete, and this necessity has leapt well beyond the ability to garner surplus value from the investment. Just the top nine internet companies are worth $100bn on the US stock market, but they make sales of just $1bn, or 1% of that valuation. And that is just sales: they make no profit. Compare that even to the ten leading technology companies like Microsoft, which are worth only $50bn but make $100bn in annual revenues (and some of these make a profit too!). Overinvestment and overcapacity will be the outcome of the internet boom.
The internet and IT revolution is a huge deflationary force on the capitalist economy. That is the result of a system that develops technology through competition and private capitalist investment. Intense competition means that the profitable advantages gained by the first company to use a new technology quickly disappear. The eventual outcome is that everybody uses the new technology and nobody gains extra profit as a result. Investment eventually shoots up much faster than the extra productivity of labour it creates.
Just a matter of time
It is only a matter of time before the US internet bubble bursts, investment collapses and mass consumption falls back because of a loss of confidence in the 'new economy'. The internet revolution is a great technical leap forward. But under capitalism it is being exploited by more and more precious investment capital being thrown into this tiny sector of the economy at the expense of all the rest. That happens under capitalism because there is no planning and no direction of resources. 'Market forces' mean speculative investment, intense competition between capitalist investors and, above all, huge over-investment in relation to profitability.
The canal share boom of 1835-36 was followed by slump and falling prices. The railway stock mania of 1869-73 was followed by the biggest depression then seen under capitalism. The same happened in the aftermath of the share boom of the 1920s, and Japan's stock market bubble of the 1980s has been followed by ten years of stagnation and recession. The optimists of capitalism believe that the internet revolution is really a low-price, low-cost boom that will last decades. The reality is that it is just another speculative financial-market bubble that will turn into a deflationary bust. As I write, just about everybody in the capitalist world, including former sceptics of internet stocks, believes that internet stocks will continue to drive upwards for the foreseeable future. When everybody agrees, you know it won't last much longer.
Revolution of www
The advent of the World Wide Web fits every definition of a revolution. Within a few brief years, the Web has grown from yet another obscure Internet protocol into the world's most widely accepted source of information. Few today question or doubt the efficacy of the Web. For the delivery of digitized material in every conceivable medium and mode, the Web looms as the cornerstone of the information age.
The World Wide Web has changed the way educators think about computers. Across Michigan and across the nation, school districts — from the most financially strapped to the most affluent — are allocating funds and resources to build on-ramps to the information highway. Districts that formerly defeated millages now routinely vote specifically in support of technology for schools. Statistically, the odds are your school is now wired, or will be soon.
As music educators, it behooves us all to become Web-aware — for our students, our programs, our administrators, and our profession. Music resources that were beyond conception a few years ago now await our collective mouse-clicks. To take advantage of the Web, and to make our own contributions, each of us needs to become acquainted with the World Wide Web.
The Web is part of the Internet, a network of networks. Born in the late 1960s, when the domain of computer users was confined to research institutions and the military, the Internet was a set of hardware and software standards for exchanging information among computer networks worldwide. A variety of protocols, for electronic mail, file transfer and so on, determined how information was transmitted and received, as long as the information was unadorned text. The Web updated the Internet for the 1990s, a world of desktop computers bearing multimedia capabilities and graphical user interfaces.
Many of the concepts of the Web can be traced to the work of Ted Nelson, the visionary and enigmatic technologist. In the 1970s Nelson began evangelizing his vision of a world in which information would be digital, ubiquitous, and interconnected. In his view, information would be linked in ways more logical than linear. For these links he coined the term "hypertext," and for the world of linked information, "Xanadu." Although careers, companies, and millions of dollars were lost in the failed effort to realize Xanadu, the concept of hypertext took on a life of its own. Computer programmers attempted to create their own realization of hypertext, most notably Bill Atkinson and his HyperCard software for the Macintosh.
Tim Berners-Lee, the father of the Web, claims he arrived at the concept of hypertext independently. Nonetheless, even Nelson has endorsed his implementation. In the early 1990s, Berners-Lee was employed as a computer consultant at CERN, the European particle physics laboratory in Geneva. Seeking to provide improved Internet services for the physicists at CERN, he submitted a detailed proposal in 1991 that described the major components of what we now know as the Web:
Hypertext - the means of linking a portion of text to another location within the document, to another document on the computer's hard drive, or to another document anywhere on the Internet;
Hypertext Markup Language (HTML) - a means of encoding text with embedded tags, supporting formatted text, hypertext, and additional media;
Hypertext Transfer Protocol (HTTP) - the means of transmitting and receiving HTML documents on the Internet;
Browser - "client" software that allows users to receive and view HTML documents on their computers.
In Berners-Lee's concept, HTML was to be platform-neutral; in fact, the first Web browser was created for the NeXT computer. In the true spirit of the Internet, CERN made the Web specifications public, allowing anyone to create a browser for interpreting HTML files.
Marc Andreessen did just that. A graduate student at the University of Illinois, Andreessen developed a Web browser called Mosaic for Macintosh and Windows. When the university made the software available free to the public in 1993, the unstoppable revolution had begun. More than ten thousand copies of Mosaic were being downloaded daily from the Illinois site. Users became enthralled with the nascent potential of the Web, and began creating their own Web sites. As the content of the Web grew, it attracted more users. By any measure, the growth was explosive.
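To make the four components above concrete, here is a minimal sketch in Python of what a browser essentially does behind the scenes: it takes a URL, issues an HTTP request for it, and receives HTML back for display. It uses only the Python standard library, and the URL shown is just the generic example.com placeholder rather than any site discussed in this article.

from urllib.request import urlopen

# A URL names the resource: scheme (http), host (example.com) and path (/).
url = "http://example.com/"

# Issue an HTTP GET request for that URL, as any browser client would.
with urlopen(url) as response:
    html = response.read().decode("utf-8", errors="replace")
    print(response.status)   # HTTP status code, e.g. 200 when the request succeeds
    print(html[:200])        # the start of the HTML that a browser would render

A real browser goes on to interpret the HTML tags, lay out the page and follow any hypertext links the user clicks; this sketch stops at fetching the raw document.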
SYSTEM DEVELOPMENT LIFE CYCLE
A system development life cycle (SDLC) consists of the following five phases:
1. Preliminary investigation
2. System analysis
3. System design
4. System acquisition
5. System implementation
PRELIMINARY INVESTIGATION
During this phase the analyst studies the problem briefly and suggests solutions to management. It is the first step of the SDLC and is also called the feasibility study. The feasibility study asks whether management's concept of the desired new system is an achievable, realistic goal in terms of money, time and the difference in end result from the original system. Often it may be decided simply to update an existing system rather than replace it completely. The purpose of this study is to evaluate and define the problem area relatively quickly, to see whether it is worthy of further study, and to suggest some possible courses of action.
In this phase the analyst also makes a preliminary study of the system requirements and of the costs and benefits involved.
SYSTEM ANALYSIS
This is an important stage of the SDLC. It begins once management has decided that further development is wanted, and the analyst then studies the application area in depth. Free from cost or other unrealistic constraints, this stage lets minds run wild: 'wonder systems' can be thought up, though each must incorporate everything asked for by management in the terms of reference.
System analysis is the phase in which the problem is studied in depth and the needs of system users are assessed. The main activities conducted during system analysis are:
· data collection
· data flow
· data analysis
· documentation
Data collection gathers information about the type of work being performed in the application under study and ascertains what resources users need to perform their jobs better.
Data analysis examines the information gathered about the application so that conclusions can be drawn about the requirements of the new system.
SYSTEM DESIGN
System design includes the following:
· DFD (data flow diagram)
· Data dictionary
· Normalization (a brief sketch follows below)
The analyst designs a new model of the system and prepares a detailed list of benefits and costs. System design focuses on how the system will work: it consists of developing a model of the new system and performing a detailed analysis of its benefits and costs.
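As an illustration of the normalization item above, here is a small Python sketch; the data and table layout are invented purely for the example and are not taken from these notes. It shows the basic idea: repeated customer details are factored out of a flat order file into a separate customers table, so each fact is stored exactly once.

# Hypothetical flat file: customer details are repeated on every order row.
flat_orders = [
    {"order_id": 1, "customer": "Acme Ltd", "city": "Lahore",  "item": "Printer"},
    {"order_id": 2, "customer": "Acme Ltd", "city": "Lahore",  "item": "Scanner"},
    {"order_id": 3, "customer": "Beta Co",  "city": "Karachi", "item": "Monitor"},
]

customers = {}   # one record per customer, keyed by customer name
orders = []      # order rows refer to the customer by key instead of repeating details

for row in flat_orders:
    customers.setdefault(row["customer"], {"customer": row["customer"], "city": row["city"]})
    orders.append({"order_id": row["order_id"], "customer": row["customer"], "item": row["item"]})

print(customers)  # customer details stored once
print(orders)     # orders now carry only a reference to the customer

In a real design this split would typically also be recorded in the data dictionary and reflected in the data flow diagrams.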
Developing a Model of the New System
Once the analyst understands the nature of the design problem, it is usually helpful to draw diagrams of the new system.
Analyzing Benefits and Costs
Most organizations are sensitive to costs, including computer system costs. The costs of a new computer system include both the initial investment in hardware and software and ongoing expenses such as personnel and maintenance.
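As a rough sketch of the kind of arithmetic involved (all the figures below are invented for illustration, not taken from these notes), the initial investment and ongoing costs can be set against the expected yearly benefit to estimate a payback period:

initial_investment = 50_000     # one-off cost: hardware and software
annual_running_cost = 8_000     # ongoing costs: personnel, maintenance, etc.
annual_benefit = 22_000         # expected yearly savings or extra revenue

net_annual_benefit = annual_benefit - annual_running_cost
payback_years = initial_investment / net_annual_benefit

print(f"Net benefit per year: {net_annual_benefit}")
print(f"Payback period: {payback_years:.1f} years")   # about 3.6 years with these figures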
In the design stage, designers produce one or more 'models' of what they see the system eventually looking like, with ideas from the analysis section either used or discarded. A document is produced with a description of the system, but nothing in it is brand-specific: it might say 'touchscreen' or 'GUI operating system', but it will not mention any specific products.
o Designing the technical architecture – choosing among the telecommunications, hardware and software architectures that will best suit the organization's system and future needs
o Designing the systems model – graphically creating a model of the system, from the graphical user interface (GUI) and GUI screen design to databases and the placement of objects on screen
o Writing the test conditions – working with the end users to develop test scripts according to the system requirements (a minimal example follows this list)
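For the test-conditions item above, here is a minimal, hypothetical sketch of what such a test script might look like in Python; the requirement, the function name and the figures are all assumptions made for illustration, not part of these notes.

import unittest

def calculate_invoice_total(items):
    """Hypothetical system function under test: sums the item prices."""
    return sum(price for _, price in items)

class InvoiceRequirementTest(unittest.TestCase):
    def test_total_matches_sum_of_item_prices(self):
        # Test condition agreed with the end users: the invoice total must
        # equal the sum of the individual item prices.
        items = [("Printer", 120.0), ("Cable", 15.5)]
        self.assertEqual(calculate_invoice_total(items), 135.5)

if __name__ == "__main__":
    unittest.main()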
SYSTEM ACQUISITION
Upon management's approval of the design, the analyst decides which vendors to use in order to meet hardware, software and servicing needs. By this phase the software and hardware requirements have been specified.
RFPs and RFQs
Many organizations formulate their buying or leasing needs by preparing a document called a request for proposal (RFP).
An organization that already knows exactly which hardware, software and services it wants instead sends vendors a document called a request for quotation (RFQ).