Sunday, December 9, 2007

TCP/IP

TCP and IP were developed by a Department of Defense (DOD) research project to connect a number of different networks designed by different vendors into a network of networks (the "Internet"). It was initially successful because it delivered a few basic services that everyone needs (file transfer, electronic mail, remote logon) across a very large number of client and server systems. Several computers in a small department can use TCP/IP (along with other protocols) on a single LAN. The IP component provides routing from the department to the enterprise network, then to regional networks, and finally to the global Internet. On the battlefield a communications network will sustain damage, so the DOD designed TCP/IP to be robust and to recover automatically from any node or phone line failure. This design allows the construction of very large networks with less central management. However, because of the automatic recovery, network problems can go undiagnosed and uncorrected for long periods of time.

As with all other communications protocols, TCP/IP is composed of layers:

IP - is responsible for moving packets of data from node to node. IP forwards each packet based on a four-byte destination address (the IP number). The Internet authorities assign ranges of numbers to different organizations. The organizations assign groups of their numbers to departments. IP operates on gateway machines that move data from department to organization to region and then around the world.

TCP - is responsible for verifying the correct delivery of data from client to server. Data can be lost in the intermediate network. TCP adds support to detect errors or lost data and to trigger retransmission until the data is correctly and completely received.

Sockets - is a name given to the package of subroutines that provide access to TCP/IP on most systems.
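
To make the four-byte IP number concrete, here is a minimal sketch using Python's standard ipaddress module. The address and range are hypothetical values taken from the 192.0.2.0/24 block reserved for documentation, and the split into four departments is an assumption for illustration only.

```python
import ipaddress

# Hypothetical example address from the 192.0.2.0/24 documentation range.
addr = ipaddress.IPv4Address("192.0.2.10")

# The "IP number" is four bytes on the wire.
packed = addr.packed
print(len(packed), list(packed))

# An organization's assigned range, split into four department-sized groups
# (the choice of four groups is an assumption for this sketch).
org_range = ipaddress.ip_network("192.0.2.0/24")
departments = list(org_range.subnets(new_prefix=26))
for dept in departments:
    print(dept, "contains", addr, ":", addr in dept)
```

The same idea applies at every level: an authority hands out a range, and the holder of that range carves it into smaller groups.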

Friday, December 7, 2007

Fiber-Optic Cable

It is a cabling technology that uses optical fibers to carry digital data signals in the form of modulated pulses of light. The core of a fiber-optic cable is made of glass or plastic and is surrounded by cladding; both are enclosed by a protective coating. The outer insulating jacket is made of Teflon or PVC. Kevlar fibers are used to strengthen the cable and prevent breakage.

A brief overview of the advantages of fiber-optic cable over coaxial and twisted pair:

Speed: Fiber optic networks operate at high speeds in the gigabits range.

Bandwidth: High-capacity data transmission.

Distance: Because attenuation is very low, signals can be transmitted over long distances without using repeaters.

Resistance: Greater resistance to outside interferences such as radios, motors and electromagnetic noise.

Disadvantages: Fiber-optic cable is expensive and very fragile.

Twisted Cables

A twisted pair cable consists of two insulated wires twisted around each other. The twisting greatly reduces the crosstalk between them.
As shown in the image, the pair of wires may be surrounded either by a shield or by similar pairs of wires. Each pair is uniquely color coded when packaged in multiple pairs. Different uses such as analog, digital, and Ethernet require different numbers of pairs.

Twisted pair wiring is commonly used to connect telephones and in computer networks, and is classified according to its maximum transmission frequency. Today, there are basically only two types in use: Cat3 and Cat5.

Some features of Twisted pair cabling are:
It is flexible and easy to terminate.
For Cat 5e/6/7 UTP the maximum segment length is 100 meters.

10BaseT refers to the specifications for unshielded twisted pair cable (Category 3, 4, or 5) carrying Ethernet signals. Category 6 is relatively new and is used for gigabit connections.

Twisted pair cables are available in two variants:
Shielded Twisted Pair (STP) and
Unshielded Twisted pair (UTP)

Plenum Cable

Plenum grade cable is a cable that uses fire-resistant material such as Teflon for insulation. This type of insulation minimizes the amount of smoke in case of fire. However, plenum cabling is more expensive and less flexible than PVC cable. This type of cable is generally used in spaces reserved for air circulation in air conditioning and heating systems.

Coaxial Cable

Coaxial cable is commonly used in the cable television industry. This has also gained popularity in use for computer networks, such as Ethernet networks. Coaxial cable is highly resistant to signal interference and can support greater distance between network devices than twisted pair cable.
Coaxial cable consists of a single copper core wire surrounded by an insulator and enclosed in a copper mesh, all covered by an outer insulating jacket.

The main features of coaxial cable:
Compared with twisted pair cable, coaxial cable can support greater cable lengths between network devices.
It is highly resistant to signal interference.
It costs less than other cables.
It is not flexible and is more difficult to terminate.

Types of Cable

A cable is a group of insulated conductors enclosed within an insulator. The main function of a cable is to transmit signals from one point to another. Cables can be broadly categorized into 5 types depending on their attenuation, cost, immunity to EMI, the skilled labor required for installation, etc.
Given below is the list of cable types used in networking.

Coaxial
RG6
RG8
RG58
RG59

Plenum/PVC

UTP
CAT3
CAT5/e
CAT6

STP

Fiber
Single-mode
Multi-mode

Networking Cable

Cable is the medium through which data is transmitted from one network device to another. There are several types of cables, which are commonly used with LANs. The growth of Local Area Networks (LANs) and client server computing has produced a great increase in the amount of cabling used.
In many cases, a network will use a variety of cable types. The type of cable chosen for a network is related to the network's topology, protocol, and size. Also the common transmission losses such as attenuation have to be taken care of while choosing the type of cable used.

Wednesday, December 5, 2007

"XHTML"

XHTML stands for eXtensible HyperText Markup Language and is a cross between HTML and XML. XHTML was created for two main reasons:
To create a stricter standard for making web pages, reducing incompatibilities between browsers
To create a standard that can be used on a variety of different devices without changes

The great thing about XHTML, though, is that it is almost the same as HTML, although it is much more important that you create your code correctly. You cannot make badly formed code XHTML compatible. Unlike HTML, where simple errors (like a missing closing tag) are ignored by the browser, XHTML code must be written exactly as specified. This is because browsers in handheld devices etc. don't have the power to show badly formatted pages, so XHTML makes sure that the code is correct and can be used on any type of browser.

XHTML is a web standard which has been agreed by the W3C and, as it is backwards compatible, you can start using it in your webpages now. Also, even if you don't think it's really necessary to update to XHTML yet, there are three very good reasons to do so:

It will help you to create better formatted code on your site

It will make your site more accessible (both in the future and now, because it also means you have correct HTML and most browsers will show your page better)

XHTML is planned to replace HTML 4 in the future

There is really no excuse not to start writing your web pages using XHTML, as it is so easy to pick up and will bring many benefits to your site.

Tuesday, December 4, 2007

List of Common Headers

· Apparently-To: Messages with many recipients sometimes have a long list of headers of the form "Apparently-To: rth@bieberdorf.edu" (one line per recipient). These headers are unusual in legitimate mail; they are normally a sign of a mailing list, and in recent times mailing lists have generally used software sophisticated enough not to generate a giant pile of headers.

· Bcc: (stands for "Blind Carbon Copy") If you see this header on incoming mail, something is wrong. It's used like Cc: (see below), but does not appear in the headers. The idea is to be able to send copies of email to persons who might not want to receive replies or to appear in the headers. Blind carbon copies are popular with spammers, since it confuses many inexperienced users to get email that doesn't appear to be addressed to them.

· Cc: (stands for "Carbon Copy", which is meaningful if you remember typewriters) This header is sort of an extension of "To:"; it specifies additional recipients. The difference between "To:" and "Cc:" is essentially connotative; some mailers also deal with them differently in generating replies.

· Comments: This is a nonstandard, free-form header field. It's most commonly seen in the form "Comments: Authenticated sender is ". A header like this is added by some mailers (notably the popular freeware program Pegasus) to identify the sender; however, it is often added by hand (with false information) by spammers as well. Treat with caution.

· Content-Transfer-Encoding: This header relates to MIME, a standard way of enclosing non-text content in email. It has no direct relevance to the delivery of mail, but it affects how MIME-compliant mail programs interpret the content of the message.
· Content-Type: Another MIME header, telling MIME-compliant mail programs what type of content to expect in the message.

· Date: This header does exactly what you'd expect: It specifies a date, normally the date the message was composed and sent. If this header is omitted by the sender's computer, it might conceivably be added by a mail server or even by some other machine along the route. It shouldn't be treated as gospel truth; forgeries aside, there are an awful lot of computers in the world with their clocks set wrong.

· Errors-To: Specifies an address for mailer-generated errors, like "no such user" bounce messages, to go to (instead of the sender's address). This is not a particularly common header, as the sender usually wants to receive any errors at the sending address, which is what most (essentially all) mail server software does by default.

· From (without colon) This is the "envelope From" discussed above.

· From: (with colon) This is the "message From:" discussed above.

· Message-Id: (also Message-id: or Message-ID:) The Message-Id is a more-or-less unique identifier assigned to each message, usually by the first mailserver it encounters. Conventionally, it is of the form "gibberish@bieberdorf.edu", where the "gibberish" part could be absolutely anything and the second part is the name of the machine that assigned the ID. Sometimes, but not often, the "gibberish" includes the sender's username. Any email in which the message ID is malformed (e.g., an empty string or no @ sign), or in which the site in the message ID isn't the real site of origin, is probably a forgery.

· In-Reply-To: A Usenet header that occasionally appears in mail, the In-Reply-To: header gives the message ID of some previous message which is being replied to. It is unusual for this header to appear except in email directly related to Usenet; spammers have been known to use it, probably in an attempt to evade filtration programs.

· Mime-Version: (also MIME-Version:) Yet another MIME header, this one just specifying the version of the MIME protocol that was used by the sender. Like the other MIME headers, this one is usually eminently ignorable; most modern mail programs will do the right thing with it.

· Newsgroups: This header only appears in email that is connected with Usenet---either email copies of Usenet postings, or email replies to postings. In the first case, it specifies the newsgroup(s) to which the message was posted; in the second, it specifies the newsgroup(s) in which the message being replied to was posted. The semantics of this header are the subject of a low-intensity holy war, which effectively assures that both sets of semantics will be used indiscriminately for the foreseeable future.


· Organization: A completely free-form header that normally contains the name of the organization through which the sender of the message has net access. The sender can generally control this header, and silly entries like "Royal Society for Putting Things on Top of Other Things" are commonplace.

· Priority: An essentially free-form header that assigns a priority to the mail. Most software ignores it. It is often used by spammers, usually in the form "Priority: urgent" (or something similar), in an attempt to get their messages read.

· Received: Discussed in detail above.

· References: The References: header is rare in email except for copies of Usenet postings. Its use on Usenet is to identify the "upstream" posts to which a message is a response; when it appears in email, it's usually just a copy of a Usenet header. It may also appear in email responses to Usenet postings, giving the message ID of the post being responded to as well as the references from that post.

· Reply-To: Specifies an address for replies to go to. Though this header has many legitimate uses (perhaps your software mangles your From: address and you want replies to go to a correct address), it is also widely used by spammers to deflect criticism. Occasionally a naive spammer will actually solicit responses by email and use the Reply-To: header to collect them, but more often the Reply-To: address in junk email is either invalid or an innocent victim.
· Sender: This header is unusual in email (X-Sender: is usually used instead), but appears occasionally, especially in copies of Usenet posts. It should identify the sender; in the case of Usenet posts, it is a more reliable identifier than the From: line.

· Subject: A completely free-form field specified by the sender, intended, of course, to describe the subject of the message.

· To: The "message To:" described above. Note that the To: header need not contain the recipient's address!

· X-headers is the generic term for headers starting with a capital X and a hyphen. The convention is that X-headers are nonstandard and provided for information only, and that, conversely, any nonstandard informative header should be given a name starting with "X-". This convention is frequently violated.

· X-Confirm-Reading-To: This header requests an automated confirmation notice when the message is received or read. It is typically ignored; presumably some software acts on it.

· X-Distribution: In response to problems with spammers using his software, the author of Pegasus Mail added this header. Any message sent with Pegasus to a sufficiently large number of recipients has a header added that says "X-Distribution: bulk". It is explicitly intended as something for recipients to filter against.

· X-Errors-To: Like Errors-To:, this header specifies an address for errors to be sent to. It is probably less widely obeyed.

· X-Mailer: (also X-mailer:) A freeform header field intended for the mail software used by the sender to identify itself (as advertising or whatever). Since much junk email is sent with mailers invented for the purpose, this field can provide much useful fodder for filters.

· X-PMFLAGS: This is a header added by Pegasus Mail; its semantics are nonobvious. It appears in any message sent with Pegasus, so it doesn't obviously convey any information to the recipient that isn't covered by the X-Mailer: header.

· X-Priority: Another priority field, used notably by Eudora to assign a priority (which appears as a graphical notation on the message).

· X-Sender: The usual email analogue to the Sender: header in Usenet news, this header purportedly identifies the sender with greater reliability than the From: header. In fact, it is nearly as easy to forge, and should therefore be viewed with the same sort of suspicion as the From: header.

· X-UIDL: This is a unique identifier used by the POP protocol for retrieving mail from a server. It is normally added between the recipient's mail server and the recipient's actual mail software; if mail arrives at the mail server with an X-UIDL: header, it is probably junk (there's no conceivable use for such a header, but for some unknown reason many spammers add one).
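
As a rough illustration of how the headers listed above can be inspected, here is a minimal sketch using Python's standard email package. The sample message, its addresses and the two helper functions are hypothetical; the Message-Id check only tests the "empty string or no @ sign" condition described above and is not a complete forgery detector.

```python
from email import message_from_string

# Hypothetical sample message; all addresses and values are made up.
RAW = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: Test\n"
    "Message-Id: <abc123@mail.example.org>\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "\n"
    "Hello.\n"
)

def message_id_looks_malformed(msg):
    """True if the Message-Id is missing, empty, or lacks an @ with text on both sides."""
    mid = (msg.get("Message-Id") or "").strip().strip("<>")
    local, at, host = mid.partition("@")
    return not (local and at and host)

def x_headers(msg):
    """Return the nonstandard X- headers as (name, value) pairs."""
    return [(name, value) for name, value in msg.items()
            if name.lower().startswith("x-")]

msg = message_from_string(RAW)
print(message_id_looks_malformed(msg))
print(x_headers(msg))
```

Header lookup through msg.get is case-insensitive, so the Message-id: and Message-ID: spellings are handled the same way.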

Post Office Protocol

Post Office Protocol version 3 (POP3) is an application layer Internet standard protocol used to retrieve email from a remote server to a local client over a TCP/IP connection. Nearly all individual Internet service provider email accounts are accessed via POP3.
The earlier versions of the POP protocol, POP (informally called POP1) and POP2, have been thoroughly made obsolete by POP3. In contemporary usage, the less precise term POP almost always means POP3 in the context of email protocols.
POP3 and its predecessors are designed to allow end users with intermittent connections such as dial-up connections to retrieve email when connected, and then to view and manipulate the retrieved messages without needing to stay connected. Although most clients have an option to leave mail on server, email clients using POP3 generally connect, retrieve all messages, store them on the user's PC as new messages, delete them from the server, and then disconnect. In contrast, the newer, more capable IMAP email retrieval protocol supports both connected and disconnected modes of operation. Email clients using IMAP generally leave messages on the server until the user explicitly deletes them. This and other facets of IMAP operation allow multiple clients to access the same mailbox. Most email clients can be configured to use either POP3 or IMAP to retrieve messages; however, ISP support for IMAP is not as common.
UIDL (Unique IDentification Listing) is a POP3 command typically used in the implementation of a client leave mail on server option. POP3 commands identify specific messages by their ordinal number on the mail server. This creates a problem for a client intending to leave messages on the server, since these message numbers may change from one connection to the server to another. For example if there were five messages when last connected and message #3 is deleted by a different client, when next connecting the last two messages' numbers decrement by one! Luckily, the POP3 RFC specifies a method of avoiding numbering issues. Basically, the server assigns an arbitrary and unique string of characters in the range 0x21 to 0x7E to the message. This ID is never reused for any message. When a POP3-compatible email client connects to the server, it can use the UIDL command to get the current mapping from these message IDs to the ordinal message numbers. Using this mapping the client can then determine which messages it has yet to download, which saves time when downloading.
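
The UIDL bookkeeping described above can be sketched as follows. This is only an illustration: the listing lines are hypothetical "number unique-id" pairs of the kind a UIDL response contains, and the helper names are made up for this example.

```python
def parse_uidl(lines):
    """Map each unique ID to its current ordinal message number."""
    mapping = {}
    for line in lines:
        number, uid = line.split()
        mapping[uid] = int(number)
    return mapping

def messages_to_download(uidl_lines, seen_uids):
    """Return the ordinal numbers of messages the client has not yet downloaded."""
    current = parse_uidl(uidl_lines)
    return sorted(num for uid, num in current.items() if uid not in seen_uids)

# Last session: five messages with IDs a..e were downloaded and left on the server.
seen = {"uid-a", "uid-b", "uid-c", "uid-d", "uid-e"}

# Since then another client deleted message #3 (uid-c) and one new message arrived,
# so the ordinal numbers of the later messages have shifted.
listing = ["1 uid-a", "2 uid-b", "3 uid-d", "4 uid-e", "5 uid-f"]

print(messages_to_download(listing, seen))
```

Because the client compares unique IDs rather than ordinal numbers, the renumbering caused by the deletion does not confuse it.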
Whether using POP3 or IMAP to retrieve messages, clients use the SMTP protocol to send messages. Email clients are sometimes referred to as either POP or IMAP clients, but in both cases SMTP is also used.
Email attachments and non-ASCII text are nearly universally conveyed in email in accordance with MIME formatting rules. Neither POP3 nor SMTP require email to be MIME formatted, but since essentially all internet email is MIME formatted POP clients by default must also understand and use MIME. IMAP is designed to assume email is MIME formatted.
Like many other older Internet protocols, POP3 originally supported only an unencrypted login mechanism. Although plain text transmission of passwords in POP3 is still common, POP3 currently supports several authentication methods to provide varying levels of protection against illegitimate access to a user's email. One such method (defined in the base specification as an optional command) is APOP, which uses MD5 in an attempt to avoid replay attacks and disclosure of a shared secret; clients implementing APOP include Mozilla, Thunderbird, Eudora, and Novell Evolution. POP3 can also support IMAP authentication methods via the AUTH extension.
It is also possible to encrypt POP3 traffic using SSL.
POP3 works over a TCP/IP connection using network port 110.

Binary and ASCII transfers

Binary transfers are for binary files, such as executables, graphics or compressed files. ASCII transfers are for text files, such as HTML documents and pure ASCII text (such as Notepad files). If you use WS-FTP or CuteFTP as your FTP program, check the AUTO box at the bottom of the program screen to have the software try to determine automatically which method to use. If you transfer binary files using the ASCII transfer method, you will corrupt your files on the server, making them unreadable. If you transfer ASCII files using the binary transfer method, the line endings are not converted for the destination system. This doesn't seem like such a big deal until you try to open the file again. Strange things will happen, such as line breaks appearing to be missing, or, if the uploaded HTML or text file is opened in a UNIX-based text editor such as vi or Pico, a ^M at the end of each line, which is ugly and difficult to read.
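
To see why the transfer mode matters, here is a small sketch that imitates the line-ending conversion an ASCII transfer might apply. The function is a simplified stand-in, not the behavior of any particular FTP program, and the sample bytes are made up.

```python
def ascii_mode_to_crlf(data: bytes) -> bytes:
    """Simplified stand-in for an ASCII transfer: rewrite every line ending as CR LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

# A text file survives: only its line endings change.
text = b"line one\nline two\n"
print(ascii_mode_to_crlf(text))

# Binary data does not: any byte that happens to equal 0x0A gets a 0x0D inserted
# before it, so the length and contents no longer match the original file.
binary = bytes([0x89, 0x50, 0x0A, 0x00, 0x0A, 0xFF])
damaged = ascii_mode_to_crlf(binary)
print(len(binary), len(damaged), damaged == binary)
```

The ^M that UNIX editors show is the leftover CR byte (0x0D) from a file whose CR LF line endings were carried across unchanged.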

HTTP

HTTP stands for Hypertext Transfer Protocol. It's the network protocol used to deliver virtually all files and other data (collectively called resources) on the World Wide Web, whether they're HTML files, image files, query results, or anything else. Usually, HTTP takes place through TCP/IP sockets (and this tutorial ignores other possibilities).

A browser is an HTTP client because it sends requests to an HTTP server (Web server), which then sends responses back to the client. The standard (and default) port for HTTP servers to listen on is 80, though they can use any port.
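
The request and response exchange can be sketched without touching the network. In this illustration the host name and the canned response are hypothetical, and the two helper functions are made up for the example; a real client would send the request bytes over a TCP socket to port 80 and read the reply from the same socket.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the text of a simple HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

def parse_status_line(response: bytes):
    """Split the first line of a response into (version, status code, reason)."""
    line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

request = build_get_request("www.example.org", "/index.html")
print(request.decode("ascii"))

# Hypothetical canned response standing in for what a server might send back.
canned = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(canned))
```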

File Transfer Protocol (FTP)

File Transfer Protocol, or FTP, is a protocol used for transferring files from one Internet location to another. FTP is a common method of exchanging files and is the method by which you'll upload files into your webhosting account. It is often described as faster than other protocols like HTTP, though in practice this depends on the client, the server and the network. FTP commands are sent and received through port 21 (the control connection); the file data itself travels over a separate data connection.

The Original Objectives of FTP are:

1. To promote sharing of files (computer programs and/or data),
2. To encourage indirect or implicit (via programs) use of remote computers,
3. To shield a user from variations in file storage systems among hosts, and
4. To transfer data reliably and efficiently. FTP, though usable directly by a user at a terminal, is designed mainly for use by programs.

First defined in RFC 172 written in June 1971, the protocol has been through several changes through to the current specification, which is defined in RFC 959. It's worth looking at its basic operation to get a better understanding of how content switching can improve performance and reliability in FTP environments.

Musical Instrument Digital Interface (MIDI)

The development of the MIDI system has been a major catalyst in the recent unprecedented explosion of music technology. MIDI has put powerful computer instrument networks and software in the hands of less technically versed musicians and amateurs and has provided new and time-saving tools for computer musicians. The system first appeared in 1982 following an agreement among manufacturers and developers of electronic musical instruments to include a common set of hardware connectors and digital codes in their instrument design. The original goal was to interface instruments of different manufacture to control common functions, such as note events, timing events, pitch bends, pedal information, etc. Though several classes of codes have been added to the MIDI 1.0 Specification (International MIDI Association, 1989) and MIDI applications have grown far beyond the original intent, the basic protocol has remained unchanged. MIDI is a system very much like a player piano roll in that it is used to specify the actions of a synthesizer or other electronic devices, while the tone or effect is generated by the instrument itself.

Data Logger

A data logger is an electronic instrument that records measurements (temperature, relative humidity, light intensity, on/off, open/closed, voltage, pressure and events) over time. Typically, data loggers are small, battery-powered devices that are equipped with a microprocessor, data storage and sensor. Most data loggers utilize turn-key software on a personal computer to initiate the logger and view the collected data.

Kruskal’s method of Spanning Tree and Comparing with Round Robin’s Algorithm.

Kruskal’s method is one way of creating a minimum spanning tree. In this method the nodes of the graph are initially considered as n distinct partial trees with one node each. At each step, two distinct partial trees are connected into a single partial tree by an edge of the graph. When connecting two distinct trees, the arc of minimum weight should be used; for that purpose the arcs are placed in a priority queue ordered by weight. The arc of lowest weight is then examined to see if it connects two distinct trees. To determine whether an arc (x, y) connects distinct trees, we can implement the trees with a father field in each node and traverse the ancestors of x and y to obtain the roots of the trees containing them. If the roots of the two trees are the same node, x and y are already in the same tree, so arc (x, y) is discarded and the arc of next lowest weight is examined. Combining two trees simply involves setting the father of the root of one to the root of the other. This method requires O(e log e) operations.
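
The steps above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: nodes are assumed to be numbered 0 to n-1, arcs are assumed to be (weight, x, y) tuples, and a root is represented as a node that is its own father (a representation chosen for this sketch).

```python
import heapq

def kruskal(n, arcs):
    """Return the arcs of a minimum spanning tree (or forest) as (weight, x, y) tuples."""
    father = list(range(n))          # each node starts as the root of its own tree

    def root(x):
        # Traverse the ancestors of x until the root of its tree is reached.
        while father[x] != x:
            x = father[x]
        return x

    heap = list(arcs)
    heapq.heapify(heap)              # one global priority queue ordered by weight

    tree_arcs = []
    while heap and len(tree_arcs) < n - 1:
        weight, x, y = heapq.heappop(heap)   # arc of lowest remaining weight
        rx, ry = root(x), root(y)
        if rx == ry:
            continue                 # x and y are already in the same tree: discard
        father[rx] = ry              # combine the two trees
        tree_arcs.append((weight, x, y))
    return tree_arcs

example_arcs = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
result = kruskal(4, example_arcs)
print(result, sum(w for w, _, _ in result))
```

In the example, arc (3, 0, 2) is discarded because nodes 0 and 2 are already joined through node 1 by the time it is examined.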
Round Robin’s algorithm is another method of creating a minimum spanning tree, which provides better performance when the number of edges is low. This algorithm is similar to Kruskal’s method except that there is a priority queue of arcs associated with each partial tree, rather than one global priority queue of all unexamined arcs. First, all partial trees are maintained in a queue, Q. Associated with each partial tree, T, is a priority queue, P(T), of all arcs with exactly one incident node in the tree, ordered by the weights of the arcs. Initially a priority queue of all arcs incident to nd is created for each node nd, and the single-node trees are inserted into Q in arbitrary order. The algorithm proceeds by removing a partial tree, T1, from the front of Q; finding the minimum-weight arc a in P(T1); deleting from Q the tree, T2, at the other end of arc a; combining T1 and T2 into a single new tree T3 [and at the same time combining P(T1) and P(T2), with a deleted, into P(T3)]; and adding T3 to the rear of Q. This continues until Q contains a single tree: the minimum spanning tree. This algorithm requires only O(e log n) operations if an appropriate implementation of the priority queues is used.
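
A rough sketch of the round-robin procedure is given below, under several assumptions of my own: nodes are numbered 0 to n-1, arcs are (weight, x, y) tuples, trees are identified by their root node, and arcs that no longer leave a tree are discarded lazily when they reach the top of its priority queue. The priority queues are merged by pushing the smaller one into the larger with heapq, which is simple but is not the specialized mergeable queue needed to reach the O(e log n) bound.

```python
import heapq
from collections import deque

def round_robin_mst(n, arcs):
    """Return the arcs of a minimum spanning tree (or forest) as (weight, x, y) tuples."""
    father = list(range(n))

    def root(x):
        while father[x] != x:
            x = father[x]
        return x

    # P(T): one priority queue per single-node tree, holding its incident arcs.
    pq = [[] for _ in range(n)]
    for w, x, y in arcs:
        pq[x].append((w, x, y))
        pq[y].append((w, x, y))
    for heap in pq:
        heapq.heapify(heap)

    queue = deque(range(n))          # Q: the partial trees, in arbitrary order
    tree_arcs = []
    while queue and len(tree_arcs) < n - 1:
        t1 = queue.popleft()
        if father[t1] != t1:
            continue                 # stale entry: this tree was merged into another
        heap = pq[t1]
        # Drop arcs whose two ends are both inside T1 already.
        while heap and root(heap[0][1]) == root(heap[0][2]):
            heapq.heappop(heap)
        if not heap:
            continue                 # no arc leaves T1 (disconnected component)
        w, x, y = heapq.heappop(heap)        # minimum-weight arc a in P(T1)
        rx, ry = root(x), root(y)
        t2 = ry if rx == t1 else rx          # tree at the other end of a
        father[t2] = t1                      # T3 keeps T1's root
        small, large = sorted((pq[t1], pq[t2]), key=len)
        for arc in small:                    # combine P(T1) and P(T2) into P(T3)
            heapq.heappush(large, arc)
        pq[t1], pq[t2] = large, []
        tree_arcs.append((w, x, y))
        queue.append(t1)                     # add T3 to the rear of Q
    return tree_arcs

rr_arcs = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
rr_result = round_robin_mst(4, rr_arcs)
print(rr_result, sum(w for w, _, _ in rr_result))
```

Instead of physically deleting T2 from Q, the sketch leaves its entry in place and skips it later, once it is no longer a root.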

There is a trade off between machine time and space required

In any process there is generally a trade-off between machine time and the space required: improving one tends to affect the other. For sorting in particular, the amount of machine time needed to run the sorting program and the amount of space the program needs are the main efficiency considerations.
If a file is small, sophisticated sorting techniques designed to minimize space and time requirements are usually worse, or only marginally better, at achieving efficiency than simpler but generally less efficient techniques. Similarly, if a particular sorting program is to be run only once and there is sufficient machine time and space in which to run it, it would be ludicrous for the programmer to spend days investigating the best methods of obtaining the last ounce of efficiency.
However, the programmer must be able to recognize that a particular sort is inefficient and must be able to justify its use in a particular situation. Many times designers and planners are surprised at the inadequacy of their creation. To avoid this, the programmer should know a range of sorting techniques and be cognizant of the advantages and disadvantages of each, so that when the need for a sort arises the programmer can supply the one that is most appropriate for the particular situation. This brings time and space into consideration when designing sorting techniques.
In most computer applications, the programmer must often optimize either time or space at the expense of the other. When considering the time required to sort a file of size n, the actual time units are not the concern, as these vary from one machine to another. Instead, the matter of interest is the change in the amount of time required to sort a file induced by a change in the file size n. This describes how the time requirement grows with the file size. Suppose y is proportional to x, so that multiplying x by a constant multiplies y by the same constant. Then doubling x will double y, and multiplying x by 10 will multiply y by 10.
There are different ways to determine the time requirements of a sort, none of which yields results that are applicable to all cases. One is to go through a sometimes intricate and involved mathematical analysis of the various cases, the result of which is often a formula giving the average time required for a particular sort as a function of the file size n.
Considering these issues, there is a trade-off between machine time and space, both of which contribute to the efficiency of a process such as sorting.

B-Tree and its application

A B-tree of order m is an m-way tree such that
-all leaves are on the same level
-all internal nodes except the root are constrained to have at most m non-empty children and at least m/2 non-empty children. The root has at most m non-empty children.
Put another way, it is a balanced search tree in which every node has between m/2 and m children, where m > 1 is a fixed integer called the order. The root may have as few as 2 children. This is a good structure if much of the tree is in slow memory (disk), since the height, and hence the number of accesses, can be kept small, say one or two, by picking a large m.
Also known as balanced multiway tree.
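
To get a feel for how a large m keeps the height small, the following sketch computes the greatest height a B-tree can have for a given number of keys. It assumes the convention stated above (every non-root node has at least m/2 children, rounded up, and the root has at least 2) and that a node with c children holds c - 1 keys; the function name is made up for this example.

```python
def max_btree_height(n_keys: int, order: int) -> int:
    """Greatest number of levels a B-tree of the given order can have with n_keys keys."""
    if order < 3:
        raise ValueError("this sketch needs order >= 3")
    if n_keys < 1:
        return 0
    t = -(-order // 2)               # minimum children per non-root node: ceil(order / 2)
    height = 1
    # A tree with h levels holds at least 2 * t**(h - 1) - 1 keys
    # (root with 1 key and 2 children, every other node with t - 1 keys).
    # Keep adding levels while the sparsest tree of the next height still fits.
    while 2 * t ** height - 1 <= n_keys:
        height += 1
    return height

for m in (4, 100, 1000):
    print(m, max_btree_height(1_000_000, m))
```

With a million keys, an order of 100 gives at most 4 levels, compared with 19 for an order of 4, which is why a large m suits trees that live mostly on disk.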
A B-tree is essentially just a sorted list of all the item identifiers from one of your data files. For example, if you have a customer file, and every customer item in the file uses a customer number as the item identifier, and if you use B-TREE-P to create a B-tree for the customer file in ZIP code order, then the resulting B-tree will simply be a list of all the customer numbers sorted by ZIP code. However, the B-TREE-P subroutines keep the sorted B-tree list structured in a special fashion that makes it very fast and easy to find any number in the list.
Just as there has to be a file to contain customer data, there has to be a file to contain a B-tree. Naturally, a good convention (and one followed in the examples already presented) is to create a file called B-TREE for keeping the B-tree data that the B-TREE-P subroutines create. Initially, the B-TREE file is completely empty. Then, each time the BTPINS subroutine is called by a program, another item identifier is inserted into the B-TREE file, and the file becomes a specially sorted and constructed list of identifiers.
The order in which item identifiers are sorted in a B-tree is controlled by the BTPKEY subroutine. Although the statements in BTPKEY may specify a very complicated sort for controlling how items in a B-tree are ordered (for example, by ZIP code by address by company by name), the only data actually saved in a B-tree are item identifiers. Therefore, it doesn't matter how complicated the sort is, since the size of the resulting B-tree file is always the same. As a very rough rule of thumb, a B-tree for a file takes approximately the same amount of space as about two SELECT lists of the file.
The actual structure of a B-tree consists of a number of nodes stored as items in the B-tree's file. Each node contains a portion of the sorted list of identifiers in the B-tree, along with pointers to other nodes in the B-tree. The number of identifiers and pointers stored in each node is controlled by a special size parameter that is passed as the second argument to the BTPINS and BTPDEL subroutines. The size parameter indicates the minimum number of identifiers in a node, and also half the maximum. For example, in the examples already presented, the node size used was 5, so each node contained from 5 to 10 item identifiers.
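One way to see why the minimum is exactly half the maximum is the standard B-tree split: when a node would hold one identifier too many, it is divided into two nodes of the minimum size and the middle identifier moves up to the parent. The Python sketch below shows that textbook step under the assumption that B-TREE-P behaves the same way; the function name split_overfull is hypothetical.

```python
def split_overfull(ids, size):
    """Split a sorted node holding 2*size + 1 identifiers into (left, median, right)."""
    if len(ids) != 2 * size + 1:
        raise ValueError("node is not overfull by exactly one identifier")
    # Both halves end up with exactly `size` identifiers, the allowed minimum.
    return ids[:size], ids[size], ids[size + 1:]


left, median, right = split_overfull(list(range(11)), 5)
print(len(left), median, len(right))  # 5 5 5
```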
The B-tree node size can be any number from 1 up. Small sizes create B-trees that may be faster to search, but take up more disk space because there are more nodes with pointers. Extremely small nodes may cause very "deep" B-trees that end up being slow to search. Larger node sizes slow down searches, but take less disk space because there are fewer pointers. The disk space occupied by nodes also depends on the length of your data file's item identifiers. A node size of 50 is often a good, all-around, starting value. Once the B-tree is built, it can be examined using standard Pick file maintenance techniques to find the optimum node size that keeps items in the B-tree file nicely packed within the boundaries of Pick's frame structure. If desired, the B-tree can then easily be rebuilt with the optimum node size.
The first node created in a B-tree file is numbered 0, the next is 1 (even though the node might be for a different B-tree in the same file), and the next node is 2, and so on. As more identifiers are inserted into the file's B-trees, more nodes are created. The special item named NEXT.ID, which is automatically created in every B-tree file, contains the number of the next node that will be created.
A file can contain any number of B-trees, but each B-tree in the file must have a unique name, which can be any string of characters. Each B-tree name is saved as an item in the B-tree file, and that item contains the number of the root node of the B-tree. The root node of a given B-tree is where every search through that tree starts. In the B-TREE-P examples, the B-TREE file contained three different B-trees, named ZIP, COMP, and LNAME, so the B-TREE file also contained three items with those names.
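The bookkeeping described above (sequential node numbers, the NEXT.ID item, and one name item per B-tree) can be modeled with a Python dictionary standing in for the B-TREE file. This is a rough sketch, not the actual B-TREE-P item layout: the node fields "ids" and "children", the helper names new_node and create_tree, and the choice to create a root node as soon as a tree is registered are all assumptions made for illustration.

```python
def new_node(btree_file):
    """Create an empty node, numbered from NEXT.ID, and return its number."""
    node_number = btree_file.setdefault("NEXT.ID", 0)
    # Item identifiers are strings, so the node is stored under "0", "1", ...
    btree_file[str(node_number)] = {"ids": [], "children": []}
    btree_file["NEXT.ID"] = node_number + 1
    return node_number


def create_tree(btree_file, name):
    """Register a B-tree under `name`; the name item holds its root node number."""
    if name in btree_file:
        raise KeyError(f"B-tree {name!r} already exists")
    btree_file[name] = new_node(btree_file)
    return btree_file[name]


btree_file = {}  # stands in for the initially empty B-TREE file
for tree_name in ("ZIP", "COMP", "LNAME"):
    create_tree(btree_file, tree_name)

print(btree_file["ZIP"], btree_file["COMP"], btree_file["LNAME"])  # 0 1 2
print(btree_file["NEXT.ID"])                                       # 3
```

Because all trees draw node numbers from the same NEXT.ID counter, consecutive node numbers can belong to different B-trees in the same file.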

ADT

A useful tool for specifying the logical properties of a data type is the ADT (Abstract Data Type). A data type is a collection of values and a set of operations on those values. The term “abstract data type” refers to the basic mathematical concept that defines the data type.

There are a number of methods for specifying an ADT.
The method that we use is semiformal and borrows heavily from C notation but extends that notation where necessary. As an example, consider an ADT RATIONAL for rational numbers. The operations on rational numbers that we define are the creation of a rational number from two integers, addition, multiplication, and testing for equality.
The following is an initial specification of this ADT.
/* value definition */
abstract typedef <integer, integer> RATIONAL;
condition RATIONAL[1] != 0;

/* operator definitions */
abstract RATIONAL makerational(a, b)
int a, b;
precondition b != 0;
postcondition makerational[0] == a;
              makerational[1] == b;

abstract RATIONAL add(a, b) /* written a + b */
RATIONAL a, b;
postcondition add[1] == a[1] * b[1];
              add[0] == a[0] * b[1] + b[0] * a[1];

abstract RATIONAL mult(a, b) /* written a * b */
RATIONAL a, b;
postcondition mult[1] == a[1] * b[1];
              mult[0] == a[0] * b[0];

abstract equal(a, b) /* written a == b */
RATIONAL a, b;
postcondition equal == (a[0] * b[1] == a[1] * b[0]);
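The specification above can be turned into running code almost line for line. The following Python sketch is one possible rendering, not part of the original specification: a rational is represented as a (numerator, denominator) tuple, the precondition becomes a raised exception, and results are deliberately not reduced to lowest terms because the postconditions do not ask for it.

```python
def make_rational(a, b):
    """Build the pair <a, b>; the precondition b != 0 is enforced."""
    if b == 0:
        raise ValueError("denominator must be nonzero")
    return (a, b)


def add(x, y):
    """add[0] == x[0]*y[1] + y[0]*x[1]; add[1] == x[1]*y[1]."""
    return (x[0] * y[1] + y[0] * x[1], x[1] * y[1])


def mult(x, y):
    """mult[0] == x[0]*y[0]; mult[1] == x[1]*y[1]."""
    return (x[0] * y[0], x[1] * y[1])


def equal(x, y):
    """Two rationals are equal when their cross products match."""
    return x[0] * y[1] == x[1] * y[0]


half = make_rational(1, 2)
third = make_rational(1, 3)
print(add(half, third))                                  # (5, 6)
print(mult(half, third))                                 # (1, 6)
print(equal(make_rational(2, 6), make_rational(1, 3)))   # True
```

Note that equal compares cross products, so <2, 6> and <1, 3> are equal even though their components differ.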

STACK AS AN ADT

A stack is an ordered collection of items into which new items may be inserted and from which items may be removed at one end, called the top of the stack. The representation of a stack as an abstract data type is straightforward. We use eltype to denote the type of the stack element and parameterize the stack data type with eltype:
abstract typedef <<eltype>> STACK(eltype);
/* <<eltype>> is a sequence of eltype values; s' denotes the value of s before the operation */

abstract empty(s)
STACK(eltype) s;
postcondition empty == (len(s) == 0);

abstract eltype pop(s)
STACK(eltype) s;
precondition empty(s) == FALSE;
postcondition pop == first(s');
              s == sub(s', 1, len(s') - 1);

abstract push(s, elt)
STACK(eltype) s;
eltype elt;
postcondition s == <elt> + s';
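A minimal Python sketch of the stack ADT follows. It is one possible implementation rather than a definitive one: the class name Stack is chosen here, the top of the stack is kept at the front of a collections.deque so that pop returns the first element as in the specification, and the precondition on pop is enforced by raising an exception.

```python
from collections import deque


class Stack:
    """Stack whose top is the first element of the underlying sequence."""

    def __init__(self):
        self._items = deque()

    def empty(self):
        """True when the stack holds no elements."""
        return len(self._items) == 0

    def push(self, elt):
        """Place elt in front of the existing elements."""
        self._items.appendleft(elt)

    def pop(self):
        """Remove and return the first element; the stack must not be empty."""
        if self.empty():
            raise IndexError("pop from an empty stack")
        return self._items.popleft()


s = Stack()
s.push("a")
s.push("b")
print(s.pop())    # b
print(s.pop())    # a
print(s.empty())  # True
```

The last item pushed is the first one popped, which is the defining last-in, first-out behavior of a stack.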

QUEUE AS AN ADT

abstract typedef <<eltype>> queue(eltype);
/* q' denotes the value of q before the operation */

abstract empty(q)
queue(eltype) q;
postcondition empty == (len(q) == 0);

abstract insert(q, elt)
queue(eltype) q;
eltype elt;
postcondition q == q' + <elt>;

abstract eltype remove(q)
queue(eltype) q;
precondition empty(q) == FALSE;
postcondition remove == first(q');
              q == sub(q', 1, len(q') - 1);
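The queue ADT can be sketched in Python in the same way. Again this is only one possible implementation: the class name Queue is chosen here, a collections.deque holds the elements, insert adds at the rear, remove takes from the front, and the precondition on remove is enforced by raising an exception.

```python
from collections import deque


class Queue:
    """Queue in which items are inserted at the rear and removed from the front."""

    def __init__(self):
        self._items = deque()

    def empty(self):
        """True when the queue holds no elements."""
        return len(self._items) == 0

    def insert(self, elt):
        """Append elt after the existing elements."""
        self._items.append(elt)

    def remove(self):
        """Remove and return the first element; the queue must not be empty."""
        if self.empty():
            raise IndexError("remove from an empty queue")
        return self._items.popleft()


q = Queue()
q.insert("a")
q.insert("b")
print(q.remove())  # a
print(q.remove())  # b
print(q.empty())   # True
```

Unlike the stack, the first item inserted is the first one removed (first-in, first-out).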

Data Structure and Algorithm

A data structure is an organization of information, usually in memory, either for better algorithm efficiency (such as a queue, stack, linked list, heap, dictionary, or tree) or for conceptual unity (such as the name and address of a person). It may include redundant information, such as the length of a list or the number of nodes in a subtree. An algorithm is a computable set of steps to achieve a desired result.

Data structures and algorithms are important to study for the following reasons:

1) To identify and develop useful mathematical entities and operations, and to determine what classes of problems can be solved by using them.
2) To determine representations for those abstract entities and to implement the abstract operations on these concrete representations.
3) To save computer memory, because the memory of a computer is limited.
4) To keep the execution time of a program low, since execution time determines its efficiency.

Software engineering

Software engineering is a technology that encompasses processes, methods, and tools.

Characteristics of good software:
1. Reliable
2. Reusable
3. Modifiable
4. Modular (divided into parts)
5. Easy to understand
6. Efficient
7. Cost effective
8. Delivered on time
9. Useful (fulfills the consumers’ needs)
10. Good user interface


Characteristics of s/w in comparison to hardware

1. H/w is manufactured, but s/w is developed or engineered.
2. S/w doesn’t wear out the way h/w does. However, as technology develops, s/w can deteriorate as it is repeatedly changed to fulfill the consumers’ needs.
3. H/w is assembled from existing components, but s/w is mostly custom built.

Monday, December 3, 2007

Basic networking in computer

In the world of computers, networking is the practice of linking two or more computing devices together for the purpose of sharing data. Networks are built with a mix of computer hardware and computer software.

Area Networks

Networks can be categorized in several different ways. One approach defines the type of network according to the geographic area it spans. Local area networks (LANs), for example, typically reach across a single home, whereas wide area networks (WANs) reach across cities, states, or even the world. The Internet is the world's largest public WAN.

Network Design

Computer networks also differ in their design. The two types of high-level network design are called client-server and peer-to-peer. Client-server networks feature centralized server computers that store email, Web pages, files, and/or applications. On a peer-to-peer network, conversely, all computers tend to support the same functions. Client-server networks are much more common in business, and peer-to-peer networks are much more common in homes.

A network topology represents its layout or structure from the point of view of data flow. In so-called bus networks, for example, all of the computers share and communicate across one common conduit, whereas in a star network, all data flows through one centralized device. Common types of network topologies include bus, star, ring and mesh.
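The difference between these layouts shows up in how many connections each one needs. The Python sketch below is a simplified model, not a description of any particular product: the function name link_count is made up, the bus is counted as one shared conduit, the star's central device is counted separately from the n computers, and "mesh" is assumed to mean a full mesh in which every pair of computers is linked directly.

```python
def link_count(topology, n):
    """Number of links needed to connect n computers in the given topology."""
    if n < 3:
        raise ValueError("this sketch assumes at least 3 computers")
    if topology == "bus":
        return 1                 # one common conduit shared by all computers
    if topology == "star":
        return n                 # one link from each computer to the central device
    if topology == "ring":
        return n                 # each computer links to the next, closing the loop
    if topology == "mesh":
        return n * (n - 1) // 2  # full mesh: one link per pair of computers
    raise ValueError(f"unknown topology: {topology!r}")


for name in ("bus", "star", "ring", "mesh"):
    print(name, link_count(name, 10))
# bus 1
# star 10
# ring 10
# mesh 45
```

The full mesh grows much faster than the others, which is one reason it is usually reserved for small or critical parts of a network.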

Network Protocols

In networking, the communication language used by computer devices is called the protocol. Yet another way to classify computer networks is by the set of protocols they support. Networks often implement multiple protocols to support specific applications. The most popular is TCP/IP, the protocol suite most commonly found on the Internet and in home networks.

Wired vs Wireless Networking

Many of the same network protocols, like TCP/IP, work in both wired and wireless networks. Networks with Ethernet cables predominated in businesses, schools, and homes for several decades. Recently, however, wireless networking alternatives have emerged as the premier technology for building new computer networks.