TCP/IP Sockets in Java: Practical Guide for Programmers. Kenneth L. Calvert and Michael J. Donahoo.
Key network programming concepts such as framing, performance, and deadlocks are illustrated through hands-on examples. David Makofske has over ten years of experience as a software engineer and consultant, with an emphasis on IP network and web development. He received his master's degree in computer science from the University of California at Santa Barbara and is currently a senior solutions architect at Akamai Technologies.
Michael J. Donahoo teaches networking to undergraduate and graduate students at Baylor University, where he is an assistant professor. He holds a Ph.D., and his research interests are in large-scale information dissemination and management. Kenneth L. Calvert is an associate professor at the University of Kentucky, where he teaches and does research on the design and implementation of computer network protocols.
Publisher: Morgan Kaufmann. Series: The Practical Guides.
TCP and UDP are called end-to-end transport protocols because they carry data all the way from one program to another, whereas IP only carries data from one host to another. TCP is designed to detect and recover from the losses, duplications, and other errors that may occur in the host-to-host channel provided by IP. TCP provides a reliable byte-stream channel so that applications do not have to deal with these problems.
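It is a connection-oriented protocol: before using it to communicate, two programs must first establish a TCP connection, which involves completing an exchange of handshake messages between the TCP implementations on the two communicating computers. UDP, on the other hand, does not attempt to recover from errors experienced by IP; it simply extends the IP best-effort datagram service so that it works between application programs instead of between hosts. Thus, applications that use UDP must be prepared to deal with losses, reordering, and so on.

Before you can talk to someone on the phone, you must supply a phone number to the telephone system. In a similar way, before a program can communicate with another program, it must tell the network something to identify the other program.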
Internet addresses are binary numbers. IPv4 addresses are 32 bits long, which allows about four billion distinct addresses. That may seem like a lot, but because of the way they are allocated, many are wasted. More than half of the total address space has already been allocated. For that reason, IPv6 was introduced. IPv6 addresses are 128 bits long. IPv4 addresses are conventionally written as a group of four decimal numbers separated by periods; this is called the dotted-quad notation.
The four numbers in a dotted-quad string represent the contents of the four bytes of the Internet address; thus, each is a number between 0 and 255. The sixteen bytes of an IPv6 address, on the other hand, are represented as groups of hexadecimal digits, separated by colons. Each group represents two bytes of the address, and leading zeros within a group may be omitted.
Also, consecutive groups that contain only zeros may be omitted altogether (but this can only be done once in any address). Technically, each Internet address refers to the connection between a host and an underlying communication channel, in other words, a network interface.
A host may have several interfaces; it is not uncommon, for example, for a host to have connections to both wired and wireless networks. Because each such network connection belongs to a single host, an Internet address identifies a host. However, the converse is not true, because a single host can have multiple interfaces, and each interface can have multiple addresses.
In fact, the same interface can have both IPv4 and IPv6 addresses. Returning to our earlier analogies, a port number corresponds to a room number at a given street address, say, that of a large building.
The postal service uses the street address to get the letter to a mailbox; whoever empties the mailbox is then responsible for getting the letter to the proper room within the building. Or consider a company with an internal telephone system: to speak to an individual in the company, you first dial the company's main number and then the extension of the person you want. Port numbers are 16-bit unsigned binary numbers, so each one is in the range 1 to 65,535 (0 is reserved).

In each version of IP, certain special-purpose addresses are defined. One of these that is worth knowing is the loopback address, which is always assigned to a special loopback interface, a virtual device that simply echoes transmitted packets right back to the sender.
The loopback interface is very useful for testing because packets sent to that address are immediately returned to the sending host.
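Moreover, it is present on every host, and can be used even when a computer has no other interfaces (i.e., is not connected to a network). The loopback address for IPv4 is 127.0.0.1; for IPv6 it is 0:0:0:0:0:0:0:1.

Another group of IPv4 addresses is reserved for private use: all addresses that start with 10 or 192.168, as well as those whose first number is 172 and whose second number is between 16 and 31. There is no corresponding class for IPv6.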
These addresses were originally designated for use in private networks that are not part of the global Internet. Today they are often used in homes and small offices that are connected to the Internet through a network address translation (NAT) device.
Such a device acts like a router that translates (rewrites) the addresses and ports in packets as it forwards them. More precisely, it maps (private address, port) pairs in packets on one of its interfaces to (public address, port) pairs on the other interface. This enables a small group of hosts to share a single public IP address. The importance of these addresses is that they cannot be reached from the global Internet. If you are trying out the code in this book on a machine that has an address in the private-use class, and you are trying to communicate with another host that does not have one of these addresses, typically you will only succeed if the host with the private address initiates communication.
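A related class contains the link-local addresses. For IPv4, such addresses begin with 169.254; for IPv6, any address whose first three hexadecimal digits are FE8, FE9, FEA, or FEB is a link-local address. Such addresses can only be used for communication between hosts connected to the same network; routers will not forward them.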
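The special-purpose address classes discussed in this section can be checked with the predicate methods of java.net.InetAddress. The following is a minimal sketch rather than code from the book; the class name AddressClassSketch and the labels it returns are made up for illustration. It assumes the argument is a numeric address literal, so no name lookup takes place.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressClassSketch {
    // Returns a rough label for a numeric IPv4 or IPv6 literal.
    // The label strings are illustrative, not standard terminology.
    static String classify(String literal) {
        InetAddress addr;
        try {
            addr = InetAddress.getByName(literal); // a literal is parsed, not resolved
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + literal, e);
        }
        if (addr.isLoopbackAddress()) return "loopback";
        if (addr.isLinkLocalAddress()) return "link-local";
        if (addr.isSiteLocalAddress()) return "private-use"; // 10/8, 172.16/12, 192.168/16 for IPv4
        if (addr.isMulticastAddress()) return "multicast";
        return "other";
    }

    public static void main(String[] args) {
        String[] samples = {"127.0.0.1", "::1", "10.1.2.3", "169.254.1.1", "fe80::1", "224.0.0.1"};
        for (String s : samples) {
            System.out.println(s + " is " + classify(s));
        }
    }
}
```

Passing a host name instead of a literal would trigger name resolution, which is why the sketch restricts itself to numeric forms.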
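Finally, another class consists of multicast addresses. Whereas regular (unicast) IP addresses refer to a single destination, multicast addresses may refer to an arbitrary number of destinations. In IPv4, multicast addresses lie in the range 224.0.0.0 to 239.255.255.255; in IPv6, multicast addresses start with FF.

Most likely you are accustomed to referring to hosts by name. However, the Internet protocols deal with addresses (binary numbers), not names. When you use a name to identify a communication endpoint, the system does some extra work to resolve the name into an address. This extra step is often worth it for a couple of reasons.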
First, names are obviously easier for humans to remember than dotted-quads or, in the case of IPv6, strings of hexadecimal digits. Second, names provide a level of indirection, which insulates users from IP address changes. The name-resolution service can access information from a wide variety of sources.
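Two of the primary sources are the Domain Name System (DNS) and local configuration databases. The DNS is a distributed database that maps domain names to Internet addresses and other information.

In the postal and telephone analogies, each communication is initiated by one party, who sends a letter or places a call, while the other party responds. Internet communication is similar. The terms client and server refer to these roles: the client program initiates communication, while the server program waits passively for clients and then responds to those that contact it. The terms are descriptive of the typical situation in which the server makes a particular capability (for example, a database service) available to any client that is able to communicate with it.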
Whether a program is acting as a client or server determines the general form of its use of the sockets API to establish communication with its peer. The client is the peer of the server and vice versa.
The client needs to know the server's address and port initially, but not vice versa; the server learns the client's address when the client contacts it. This is analogous to a telephone call: in order to be called, a person does not need to know the telephone number of the caller. As with a telephone call, once the connection is established, the distinction between server and client disappears. Usually, the client knows the name of the server it wants (for example, from a Uniform Resource Locator, or URL) and uses the name-resolution service to learn the corresponding Internet address. In principle, servers can use any port, but the client must be able to learn what it is. In the Internet, there is a convention of assigning well-known port numbers to certain applications.
A list of all the assigned port numbers is maintained by the numbering authority of the Internet, the Internet Assigned Numbers Authority (IANA).

A socket allows an application to plug in to the network and communicate with other applications that are plugged in to the same network. Stream sockets use TCP as the end-to-end protocol (with IP underneath) and thus provide a reliable byte-stream service. Datagram sockets use UDP (again, with IP underneath) and thus provide a best-effort datagram service. Stream and datagram sockets are also supported by other protocol suites, but this book deals only with TCP stream sockets and UDP datagram sockets.

[Figure: Sockets, protocols, and ports.]

As you proceed, you will encounter several ways for a socket to become bound to an address.
Note that a single socket abstraction can be referenced by multiple application programs. Each program that has a reference to a particular socket can communicate through that socket. In practice, separate programs that access the same socket would usually belong to the same application.
TCP hides all of this, providing a reliable service that takes and delivers an unbroken stream of bytes. We begin by demonstrating how Java applications identify network hosts using the InetAddress and SocketAddress abstractions.
A host may be identified either by its numerical address or by a name. In the latter case the name must be resolved to a numerical address before it can be used for communication.
For each Java networking class described in this text, we include only the most important and commonly used methods, omitting those that are deprecated or beyond the use of our target audience.
However, this is something of a moving target. For example, the number of methods provided by the Socket class grew from 23 to 42 over successive versions of the language.

The InetAddress abstraction represents a network destination, encapsulating both names and numerical address information. The class has two subclasses, Inet4Address and Inet6Address, representing the two versions in use. Instances of InetAddress are immutable: once created, each one always refers to the same address. To get the addresses of the local host, the program takes advantage of the NetworkInterface abstraction.
Recall that IP addresses are actually assigned to the connection between a host and a network and not to the host itself. The NetworkInterface class provides access to information about all of a host's interfaces. This is extremely useful, for example when a program needs to inform another program of its address. The example program obtains the list of interfaces, checks for an empty list, and then gets and prints the address(es) of each interface in the list, noting whether each is an Inet4Address or an Inet6Address. At this time the only subtypes of InetAddress are those listed, but conceivably there might be others someday.
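The interface-listing steps described above can be sketched as follows. This is an illustrative reconstruction under assumptions, not the book's listing: the class name InterfaceListSketch and the output format are invented here, and the result depends entirely on the machine it runs on.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class InterfaceListSketch {
    // Returns one line per (interface, address) pair found on this host.
    static List<String> localAddresses() throws SocketException {
        List<String> result = new ArrayList<>();
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            return result; // no interfaces found: the list stays empty
        }
        for (NetworkInterface iface : Collections.list(interfaces)) {
            for (InetAddress addr : Collections.list(iface.getInetAddresses())) {
                String version = (addr instanceof Inet4Address) ? "v4"
                        : (addr instanceof Inet6Address) ? "v6" : "?";
                result.add(iface.getName() + " (" + version + "): " + addr.getHostAddress());
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        List<String> lines = localAddresses();
        if (lines.isEmpty()) {
            System.out.println("--No interfaces found--");
        }
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```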
The program then gets the names and addresses for each command-line argument, catching the exception that is thrown when a name cannot be resolved; in that case it reports that it was unable to find an address for the name. Link-local IPv6 addresses begin with fe8. You may also notice a delay when resolving a name that does not exist. Your resolver looks in several places before giving up on resolving a name. When the name service is not available for some reason (say, the program is running on a machine that is not connected to any network), attempting to identify a host by name may fail.
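It is, therefore, good to know that you can always refer to a host using the IP address in dotted-quad notation. If you can ping a host using one of its names, the example program should be able to resolve that name as well. If your ping test fails or the program hangs, try specifying the IP address instead, which avoids the name-to-address conversion altogether. See also the isReachable method of InetAddress, discussed below. For numeric IPv6 addresses, the shorthand forms described in Chapter 1 may be used.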
A name may be associated with more than one numeric address; the getAllByName method returns an instance for each address associated with a name. The getAddress method returns the binary form of the address as a byte array of appropriate length. If the instance is of Inet4Address, the array is four bytes in length; if of Inet6Address, it is 16 bytes.
As we have seen, an InetAddress instance may be converted to a String representation in several ways. The numeric representation of the address only is returned by getHostAddress. For an IPv6 address, the string representation always includes the full eight groups (i.e., exactly seven colons). The getHostName and getCanonicalHostName methods return a name for the host; both methods return the numerical form of the address if resolution cannot be completed.
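The relationship between the textual forms and the binary form can be seen in a few lines. The sketch below is an illustration only (the class name AddressFormsSketch and its parse helper are hypothetical); it uses numeric literals so that no name service is consulted.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressFormsSketch {
    // Parses a numeric address literal; for a literal, getByName does not query a name service.
    static InetAddress parse(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        for (String literal : new String[] {"10.1.2.3", "::1"}) {
            InetAddress addr = parse(literal);
            // getAddress yields 4 bytes for IPv4 and 16 bytes for IPv6;
            // getHostAddress yields the full numeric form.
            System.out.println(literal + " -> " + addr.getHostAddress()
                    + " (" + addr.getAddress().length + " bytes)");
        }
    }
}
```

Note that the shorthand form ::1 is accepted on input, while getHostAddress writes out all eight groups.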
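Also, both check permission with the security manager before sending any messages.

InetAddress also provides methods that test properties of an address, among them isAnyLocalAddress, isLinkLocalAddress, isLoopbackAddress, and isMulticastAddress. They all work for both IPv4 and IPv6 addresses. The fourth method checks whether it is a multicast address (see Chapter 4). For multicast addresses, further methods report the scope of the address. The scope determines, roughly, how far packets addressed to that destination can travel from their origin.

The isReachable methods check whether it is actually possible to exchange packets with the host identified by the address. Note that, unlike the other methods, which involve simple syntactic checks, these methods cause the networking system to take action, namely sending packets. One form accepts a time-to-live (TTL) value. The TTL limits the distance a packet can travel through the network.

The NetworkInterface class provides a large number of methods, many of which are beyond the scope of this book. We describe here the most useful ones for our purposes. The list of interfaces may include the loopback interface, which is not reachable from other hosts. Similarly, the list of addresses may contain link-local addresses that also are not globally reachable. The getName method returns the name of the interface (not the host).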
This generally consists of an alphabetic string followed by a numeric part, for example eth0. The loopback interface is named lo0 on many systems.

Java provides two classes for TCP: Socket and ServerSocket. An instance of Socket represents one end of a TCP connection. An instance of ServerSocket listens for TCP connection requests and creates a new Socket instance to handle each incoming connection.
Thus, servers handle both ServerSocket and Socket instances, while clients use only Socket.
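We begin by examining an example of a simple client. The typical TCP client goes through three steps:
1. Construct an instance of Socket: the constructor establishes a TCP connection to the specified remote host and port.
2. Communicate using the socket's I/O streams: a connected instance of Socket contains an InputStream and an OutputStream that can be used just like any other Java I/O stream.
3. Close the connection using the close method of Socket.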
An echo server simply repeats whatever it receives back to the client. The string to be echoed is provided as a command-line argument to our client. Some systems include an echo server for debugging and testing purposes. You may be able to use a program such as telnet to test if the standard echo server is running on your system.

The client, TCPEchoClient, uses the Socket and SocketException classes from java.net and the IOException, InputStream, and OutputStream classes from java.io. It proceeds as follows.

Application setup and parameter parsing: the getBytes method of String returns a byte array representation of the string.
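See Chapter 3 for details. If we specify a third parameter, Integer.parseInt converts the string to the equivalent integer port number.

TCP socket creation: the Socket constructor creates a socket and connects it to the specified server, identified either by name or by IP address, before returning. Note that the underlying TCP deals only with IP addresses; if a name is given, the implementation resolves it to the corresponding address. If the connection attempt fails for any reason, the constructor throws an IOException.

Get socket input and output streams: associated with each connected Socket instance are an InputStream and an OutputStream. We send data over the socket by writing bytes to the OutputStream just as we would any other stream, and we receive by reading from the InputStream.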
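Send the string to the echo server: the write method of OutputStream transmits the given byte array over the connection to the server.

Receive the reply from the echo server: since we know the number of bytes to expect from the echo server, we can repeatedly receive bytes until we have received as many bytes as we sent. This particular form of read takes three parameters: the byte array to receive into, the offset into the array at which the first received byte is placed, and the maximum number of bytes to place in the array. The read method blocks until some data is available, reads up to the specified maximum number of bytes, and returns the number of bytes actually placed in the array, or -1 if the end of the stream has been reached. For the client, the latter indicates that the server prematurely closed the socket.

Why not just a single read? TCP does not preserve read and write message boundaries. That is, even though we sent the echo string with a single write, the echo server may receive it in multiple chunks. Even if the echo string is handled in one chunk by the echo server, the reply may still be broken into pieces by TCP.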
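One of the most common errors for beginners is the assumption that data sent by a single write will always be received in a single read.

Print echoed string and close socket: when the client has received as many bytes as it sent, it prints the echoed string and closes the socket. To communicate with an echo server, we pass the server's name or address and the string to echo on the command line.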
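The client steps described above, in particular the receive loop, can be sketched as follows. This is a minimal illustration rather than the book's TCPEchoClient listing: the class name TcpEchoClientSketch and the helper readExactly are hypothetical, and UTF-8 is assumed as the string encoding.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class TcpEchoClientSketch {
    // Keeps calling read until exactly 'expected' bytes have arrived,
    // because a single read may return only part of what the peer wrote.
    static byte[] readExactly(InputStream in, int expected) throws IOException {
        byte[] buf = new byte[expected];
        int total = 0;
        while (total < expected) {
            int n = in.read(buf, total, expected - total);
            if (n == -1) {
                throw new EOFException("Connection closed prematurely");
            }
            total += n;
        }
        return buf;
    }

    // Connects, sends the text, waits for the same number of bytes back, and closes.
    static String echo(String server, int port, String text) throws IOException {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        try (Socket socket = new Socket(server, port)) {
            socket.getOutputStream().write(data);
            byte[] reply = readExactly(socket.getInputStream(), data.length);
            return new String(reply, StandardCharsets.UTF_8);
        }
    }
}
```

A call such as TcpEchoClientSketch.echo(host, port, "Echo this!") should return the same string when an echo service is listening at the given host and port; both values are placeholders to be supplied by the reader.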
The Socket constructors optionally take a local address and port in addition to the remote ones. Specifying the local address may be useful on a host with multiple interfaces. String arguments that specify destinations can be in the same formats that are accepted by the InetAddress creation methods. The last constructor creates an unconnected socket, which must be explicitly connected (via the connect method) before it can be used for communication.
Stream methods: getInputStream and getOutputStream return the streams used to read from and write to the socket. Once the socket has been closed, any subsequent attempt to read from it will cause an exception to be thrown. See Chapter 4 for ways to close the two directions of a connection separately. By default, Socket is implemented on top of a TCP connection; however, in Java, you can actually change the underlying implementation of Socket. The Socket class also has a large number of other associated attributes, referred to as socket options.
Because they are not necessary for writing basic applications, we postpone introduction of them until Chapter 4.

The InetSocketAddress class encapsulates a host address and a port. The constructor that takes a string hostname attempts to resolve the name to an IP address; the static createUnresolved method allows an instance to be created without attempting this resolution. The isUnresolved method returns true if the instance was created this way, or if the resolution attempt in the constructor failed.
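The difference between resolved and unresolved instances can be observed directly. The following sketch is illustrative only: the class name SocketAddressSketch and its describe helper are hypothetical, and server.example.com is a placeholder name that is never looked up.

```java
import java.net.InetSocketAddress;

class SocketAddressSketch {
    // Summarizes an InetSocketAddress without triggering any name lookup:
    // getHostString returns the stored name or literal as-is.
    static String describe(InetSocketAddress addr) {
        String state = addr.isUnresolved() ? "unresolved" : "resolved";
        return addr.getHostString() + ":" + addr.getPort() + " (" + state + ")";
    }

    public static void main(String[] args) {
        // Created without attempting resolution.
        System.out.println(describe(InetSocketAddress.createUnresolved("server.example.com", 7)));
        // A numeric literal is parsed immediately, so this instance is resolved.
        System.out.println(describe(new InetSocketAddress("127.0.0.1", 7)));
    }
}
```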
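The getAddress, getHostName, and getPort methods provide access to the components of the address. The toString method returns the name and address followed by a colon and the port number. If the InetSocketAddress is unresolved, only the String with which it was created precedes the colon.

The typical TCP server goes through two steps:
1. Construct a ServerSocket instance, specifying the local port. This socket listens for incoming connections to the specified port.
2. Repeatedly call the accept method of ServerSocket to get the next incoming client connection, communicate with the client using the returned Socket's streams, and close the client socket when finished.
Upon establishment of a new client connection, an instance of Socket for the new connection is created and returned by accept.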
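The server is very simple. It runs forever, repeatedly accepting a connection, receiving and echoing bytes until the connection is closed by the client, and then closing the client socket.

Server socket creation: the server creates a ServerSocket that listens for client connection requests on the port given on the command line.

Loop forever, iteratively handling incoming connections: the sole purpose of a ServerSocket instance is to supply a new, connected Socket instance for each new incoming TCP connection. When the server is ready to handle a client, it calls accept, which blocks until an incoming connection is made to the ServerSocket's port. If a connection arrives between the time the server socket is constructed and the call to accept, the new connection is queued, and in that case accept returns immediately.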
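See Chapter 6 for details of connection establishment. The accept method of ServerSocket returns an instance of Socket that is already connected to the client's remote socket and ready for reading and writing. The server reports the address and port of the connecting client; the name part is empty because the instance was created from the address information only. The read method of InputStream fetches up to the maximum number of bytes the array can hold (in this case, BUFSIZE bytes) into the byte array receiveBuf and returns the number of bytes read.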
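In fact, it can return after having read only a single byte. The write method of OutputStream then writes the received bytes back to the client; in this case, the offset 0 indicates to take bytes starting from the front of the buffer, and the number of bytes just read is used as the length. When the client closes the connection, read returns -1 and the server closes the client socket.

The ServerSocket constructors take the local port as a parameter. Valid port numbers are in the range 0 to 65,535; specifying 0 lets the system pick an available port. Optionally, the size of the connection queue and the local address can also be set. Note that the maximum queue size may not be a hard limit, and cannot be used to control client population. Specifying the local address may be useful for hosts with multiple interfaces where the server wants to accept connections on only one of its interfaces.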
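The accept, read, and write loop described above can be sketched as follows. This is a hedged reconstruction, not the book's listing: the class name TcpEchoServerSketch and the method echoUntilClosed are invented for illustration, and BUFSIZE is given an arbitrary value of 32.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

class TcpEchoServerSketch {
    private static final int BUFSIZE = 32; // receive buffer size (arbitrary choice)

    // Copies bytes from in to out until read reports end-of-stream (-1).
    // Each read may return anywhere from 1 to BUFSIZE bytes.
    static long echoUntilClosed(InputStream in, OutputStream out) throws IOException {
        byte[] receiveBuf = new byte[BUFSIZE];
        long total = 0;
        int recvMsgSize;
        while ((recvMsgSize = in.read(receiveBuf)) != -1) {
            out.write(receiveBuf, 0, recvMsgSize); // echo only the bytes just read
            total += recvMsgSize;
        }
        return total;
    }

    // Runs forever, handling one client at a time.
    static void serve(int port) throws IOException {
        try (ServerSocket servSock = new ServerSocket(port)) {
            while (true) {
                try (Socket clntSock = servSock.accept()) { // blocks until a client connects
                    System.out.println("Handling client at " + clntSock.getRemoteSocketAddress());
                    echoUntilClosed(clntSock.getInputStream(), clntSock.getOutputStream());
                } // the client socket is closed here: we are done with this client
            }
        }
    }
}
```

Separating the copy loop from the socket handling is a design choice made for this sketch; it allows the loop to be exercised with ordinary in-memory streams.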
The fourth constructor creates a ServerSocket that is not associated with any local port; it must be bound to a port (see bind below) before it can be used.

Operation
void bind(SocketAddress endpoint)
void bind(SocketAddress endpoint, int queueLimit)
Socket accept()
void close()

The bind methods associate this socket with a local address and port.
A ServerSocket can only be associated with one port. If no established connection is waiting, accept blocks until one is established or a timeout occurs. The close method closes the socket.
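After invoking this method, incoming client connection requests for this socket are rejected. A ServerSocket does not itself carry application data, so it has no associated I/O streams. It does, however, have other attributes called options, which can be controlled via various methods, as described in Chapter 4.

The basic I/O paradigm for TCP sockets in Java is the stream abstraction. (The NIO facilities, added in Java 1.4, provide an alternative abstraction.) A stream is simply an ordered sequence of bytes. Java input streams support reading bytes, and output streams support writing bytes.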
When we write to the output stream of a Socket, the bytes can eventually be read from the input stream of the Socket at the other end of the connection. OutputStream is the abstract superclass of all output streams in Java.

Operation
abstract void write(int data)
void write(byte[] data)
void write(byte[] data, int offset, int length)
void flush()
void close()

The write methods transfer to the output stream a single byte, an entire array of bytes, and the bytes in an array beginning at offset and continuing for length bytes, respectively.
The single-byte method writes the low-order eight bits of the integer argument. These operations, if called on a stream associated with a TCP socket, may block if a lot of data has been sent, but the other end of the connection has not called read on the associated input stream recently.
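This can have undesirable consequences if some care is not used (see Chapter 6). The flush method pushes any buffered data out to the output stream. The close method terminates the stream, after which further calls to write will throw an exception. InputStream is the abstract superclass of all input streams. Using an InputStream, we can read bytes from and close the input stream.

Operation
abstract int read()
int read(byte[] data)
int read(byte[] data, int offset, int length)
int available()
void close()

The first three methods get data from the stream. The first form places a single byte in the low-order eight bits of the returned int. The second form transfers up to data.length bytes from the input stream into data and returns the number of bytes transferred.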
The third form does the same, but places data in the array beginning at offset, and transfers only up to length bytes.
If no data is available, but the end-of-stream has not been detected, all the read methods block until at least one byte can be read.
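The available method returns the number of bytes available for reading at the time it was called.

UDP provides an end-to-end service different from that of TCP. In fact, UDP performs only two functions: it adds another layer of addressing (ports) to that of IP, and it detects some forms of data corruption that may occur in transit and discards any corrupted messages. Because of this simplicity, UDP sockets have some different characteristics from the TCP sockets we saw earlier. For example, UDP sockets do not have to be connected before being used. Similarly, each message (called a datagram) carries its own address information and is independent of all others.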
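Whereas TCP does not preserve message boundaries, UDP sockets preserve them. This makes receiving an application message simpler, in some ways, than it is with TCP sockets. This is discussed further later in this chapter. A final difference is that the end-to-end transport service UDP provides is best-effort: there is no guarantee that a message sent via a UDP socket will arrive at its destination, and messages can be delivered in a different order than they were sent. A program using UDP sockets must therefore be prepared to deal with loss and reordering.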
Both clients and servers use DatagramSockets to send and receive DatagramPackets. To send, a Java program constructs a DatagramPacket instance containing the data to be sent and passes it as an argument to the send method of a DatagramSocket. In addition to the data, each instance of DatagramPacket also contains address and port information, the semantics of which depend on whether the datagram is being sent or received.
When a DatagramPacket is sent, the address and port identify the destination; for a received DatagramPacket, they identify the source of the received message. See the following reference and the discussion later in this chapter.

Creation
DatagramPacket(byte[] data, int length)
DatagramPacket(byte[] data, int offset, int length)
DatagramPacket(byte[] data, int length, InetAddress remoteAddr, int remotePort)
DatagramPacket(byte[] data, int offset, int length, InetAddress remoteAddr, int remotePort)
DatagramPacket(byte[] data, int length, SocketAddress sockAddr)
DatagramPacket(byte[] data, int offset, int length, SocketAddress sockAddr)

These constructors create a datagram whose data portion is contained in the given byte array.
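The first two forms are typically used to construct DatagramPackets for receiving, because no destination address is specified. The last four forms are typically used to construct DatagramPackets for sending. The internal datagram length can be set explicitly either by the constructor or by the setLength method. The receive method of DatagramSocket uses the internal length in two ways: on input, it specifies the maximum number of bytes of a received message that will be copied into the byte array, and on return, it indicates the number of bytes actually placed in the byte array. There is no setOffset method; however, the offset can be set with setData. The getData method returns the byte array associated with the datagram. The returned object is a reference to the byte array that was most recently associated with this DatagramPacket, either by the constructor or by setData.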
The setData methods make the given byte array the data portion of the datagram.

The typical UDP client goes through three steps:
1. Construct an instance of DatagramSocket, optionally specifying the local address and port.
2. Communicate by sending and receiving instances of DatagramPacket using the send and receive methods of DatagramSocket.
3. When finished, deallocate the socket using the close method of DatagramSocket.

A UDP echo server simply sends each datagram that it receives back to the client.
Many systems include a UDP echo server for debugging and testing purposes. One consequence of using UDP is that datagrams can be lost. In the case of our echo protocol, either the echo request from the client or the echo reply from the server may be lost in the network. Recall that our TCP echo client sends an echo string and then blocks on read waiting for a reply. If we try the same strategy with our UDP echo client and the echo request datagram is lost, our client will block forever on receive.
To avoid this problem, our client uses the setSoTimeout method of DatagramSocket to specify a maximum amount of time to block on receive, so it can try again by resending the echo request datagram. Our echo client performs the following steps:
1. Send the echo string to the server.
2. Block on receive for up to the timeout period, starting over (up to a maximum number of tries) if the reply is not received before the timeout.
3. Terminate the client.
The client uses the DatagramSocket, DatagramPacket, and InetAddress classes from java.net.

Application setup and parameter processing proceed as in the TCP client.

UDP socket creation: this instance of DatagramSocket can send datagrams to any UDP socket. We do not specify a local address or port, so some local address and available port will be selected. We could explicitly set them in the constructor or with the bind method, if desired.
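Set the socket timeout: the timeout for a datagram socket controls the maximum amount of time (in milliseconds) that a call to receive will block. Here we set the timeout to three seconds. Note that timeouts are not precise: the call may block for more than the specified time (but not less).

Create datagram to send: to create a datagram for sending, we specify the data, the destination address, and the destination port. For the destination address, we may identify the echo server either by name or IP address.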
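If we specify a name, it is converted to the actual IP address when the address object is created.

Create datagram to receive: to create a datagram for receiving, we only need to specify a byte array to hold the datagram data.

Send the datagram: since datagrams may be lost, we must be prepared to retransmit. We loop, sending the datagram and blocking on receive, until a reply arrives or the timer expires. Timer expiration is indicated by an InterruptedIOException (in current Java versions, its subclass SocketTimeoutException).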
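If the timer expires, we increment the send attempt count (tries) and start over. After the maximum number of tries, the while loop exits without receiving a datagram.

Print reception results and close the socket: if a datagram was received, the client prints the echoed data; otherwise it reports that no response was received. It then closes the socket.

A DatagramSocket may optionally be connected to a remote address and port with the connect method; once connected, it can only exchange datagrams with that address and port. A socket connected to a multicast or broadcast address can only send datagrams, because a datagram source address is always a unicast address (see Chapter 4).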
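The send, wait, and retry logic of the UDP echo client described above can be sketched as follows. This is an illustration under assumptions, not the book's listing: the class name UdpEchoClientSketch is hypothetical, the timeout and retry limit are passed as parameters rather than fixed, and the method returns null when no reply arrives.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.util.Arrays;

class UdpEchoClientSketch {
    // Sends data to server:port and waits for the echo, resending on timeout.
    // Returns the echoed bytes, or null if every attempt timed out.
    static byte[] echo(InetAddress server, int port, byte[] data,
                       int timeoutMillis, int maxTries) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) { // any local address and port
            socket.setSoTimeout(timeoutMillis);              // maximum blocking time for receive
            DatagramPacket sendPacket = new DatagramPacket(data, data.length, server, port);
            DatagramPacket receivePacket = new DatagramPacket(new byte[data.length], data.length);
            for (int tries = 0; tries < maxTries; tries++) {
                socket.send(sendPacket);
                try {
                    socket.receive(receivePacket);           // blocks until reply or timeout
                    if (!receivePacket.getAddress().equals(server)) {
                        throw new IOException("Received packet from an unknown source");
                    }
                    return Arrays.copyOfRange(receivePacket.getData(),
                            receivePacket.getOffset(),
                            receivePacket.getOffset() + receivePacket.getLength());
                } catch (SocketTimeoutException e) {
                    // Timer expired: fall through to the next iteration and resend.
                }
            }
            return null; // no response after maxTries attempts
        }
    }
}
```

With a three-second timeout and five tries, a call would look like UdpEchoClientSketch.echo(address, port, bytes, 3000, 5), where the address and port are placeholders for a reachable UDP echo service.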
Note that connecting is strictly a local operation because (unlike TCP) there is no end-to-end packet exchange involved. The close method indicates that the socket is no longer in use; further attempts to send or receive throw an exception.

The getInetAddress and getPort methods return the remote address and port to which the socket is connected. The third method, getRemoteSocketAddress, returns both address and port conveniently encapsulated in an instance of SocketAddress, or null if unconnected.
The last three methods (getLocalAddress, getLocalPort, and getLocalSocketAddress) provide the same service for the local address and port. The getLocalSocketAddress method returns null if the socket is not bound.
Sending and receiving
void send(DatagramPacket packet)
void receive(DatagramPacket packet)

The send method sends the DatagramPacket. If the socket is connected, the packet is sent to the address to which the socket is connected, unless the DatagramPacket specifies a different destination, in which case an exception is thrown. Otherwise, the packet is sent to the destination indicated by the DatagramPacket. This method does not block. The receive method blocks until a datagram is received, and then copies its data into the given DatagramPacket. If the socket is connected, the method blocks until a datagram is received from the remote socket to which it is connected.
Options
int getSoTimeout()
void setSoTimeout(int timeoutMillis)

These methods return and set, respectively, the maximum amount of time that a receive call will block for this socket. If the timer expires before data is available, an InterruptedIOException is thrown. The timeout value is given in milliseconds. DatagramSocket has other options as well; they are described more fully in Chapter 4.
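The typical UDP server goes through three steps:
1. Construct an instance of DatagramSocket, specifying the local port and, optionally, the local address. The server is now ready to receive datagrams from any client.
2. Receive an instance of DatagramPacket using the receive method of DatagramSocket. When receive returns, the datagram contains the client's address, so we know where to send the reply.
3. Communicate by sending and receiving DatagramPackets using the send and receive methods of DatagramSocket.

The server is very simple: it loops forever, receiving datagrams and then sending the same datagrams back to the client. It uses the IOException class from java.io and the DatagramPacket and DatagramSocket classes from java.net.

Create and set up datagram socket: unlike our UDP client, a UDP server must explicitly set its local port to a number known by the client; otherwise, the client will not know the destination port for its echo request datagram.

Create datagram: this datagram will be used both to receive the echo request and to send the echo reply.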
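Iteratively handle incoming echo requests: the UDP server uses a single socket for all communication, unlike the TCP server, which creates a new socket for each connection. The receive method blocks until a datagram arrives; the server then sends the same packet, which already contains the client's address and port, back with send. Receiving sets the internal length of the packet to the length of the message received. If we do not reset the internal length before receiving again, the next message will be truncated if it is longer than the one just received.

Each call to receive on a DatagramSocket returns data from at most one call to send. This differs from TCP, where the data written by one call may be returned by several reads; this is covered in more detail in Chapter 6. UDP also does not buffer data for possible retransmission. This means that by the time a call to send returns, the message has been passed to the underlying channel for transmission and is (or soon will be) on its way out the door.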
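The receive, send, and reset cycle of the UDP echo server described above can be sketched as follows. This is a hedged illustration, not the book's listing: the class name UdpEchoServerSketch is invented, ECHOMAX is assumed to be 255 bytes, and the loop handles a fixed number of datagrams so that it terminates. Newer JDKs may track the buffer capacity separately from the received length, in which case the explicit reset is redundant but harmless.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;

class UdpEchoServerSketch {
    static final int ECHOMAX = 255; // maximum size of an echo datagram (assumed value)

    // Echoes 'count' datagrams received on the given socket back to their senders.
    static void serve(DatagramSocket socket, int count) throws IOException {
        // One packet object is reused for both receiving and sending.
        DatagramPacket packet = new DatagramPacket(new byte[ECHOMAX], ECHOMAX);
        for (int i = 0; i < count; i++) {
            socket.receive(packet);    // fills in data, length, and the sender's address and port
            socket.send(packet);       // the same packet goes back to where it came from
            packet.setLength(ECHOMAX); // reset the length so a longer next message is not truncated
        }
    }

    public static void main(String[] args) throws IOException {
        int port = Integer.parseInt(args[0]); // local port known to clients
        try (DatagramSocket socket = new DatagramSocket(port)) {
            while (true) {
                serve(socket, 1); // run forever, one datagram at a time
            }
        }
    }
}
```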
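With a connected TCP socket, all received-but-not-yet-delivered bytes are treated as one continuous sequence of bytes (see Chapter 6). With UDP, in contrast, received data is kept as separate messages: a call to receive will never return more than one message. However, if receive is called with a DatagramPacket whose buffer holds n bytes and the first message in the receive queue is longer than n bytes, only the first n bytes of the message are returned. The remaining bytes are quietly discarded, with no indication to the receiving program that information has been lost!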
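For this reason, a receiver should always supply a DatagramPacket with enough room to hold the largest message allowed by its application protocol. This technique will guarantee that no data will be lost. The maximum amount of data that can be transmitted in a DatagramPacket is 65,507 bytes, the largest payload that can be carried in a UDP datagram over IPv4.

Note also that getData always returns the entire byte array associated with the packet, ignoring the internal offset and length. For example, suppose buf is a byte array of size 20, which has been initialized so that each byte contains its index in the array, and that a message is received into a DatagramPacket dg that uses only part of buf. The array returned by dg.getData() is all of buf, not just the received bytes. One possibility is to copy the received data into a separate byte array, using dg.getOffset() and dg.getLength() to locate it. As of Java 1.6, the convenience method Arrays.copyOfRange can do this in one step.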
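The copy described above can be sketched in a few lines. The class name DatagramPayloadSketch and the offset and length values in main are illustrative assumptions; no network traffic is involved, because the packet is simply constructed over part of an existing array.

```java
import java.net.DatagramPacket;
import java.util.Arrays;

class DatagramPayloadSketch {
    // Returns a copy of just the bytes the packet refers to,
    // honoring the packet's internal offset and length.
    static byte[] payload(DatagramPacket dg) {
        return Arrays.copyOfRange(dg.getData(), dg.getOffset(), dg.getOffset() + dg.getLength());
    }

    public static void main(String[] args) {
        byte[] buf = new byte[20];
        for (int i = 0; i < buf.length; i++) {
            buf[i] = (byte) i; // each byte contains its index in the array
        }
        DatagramPacket dg = new DatagramPacket(buf, 5, 8); // refers to buf[5] through buf[12]
        System.out.println("getData length: " + dg.getData().length);  // the whole array: 20
        System.out.println("payload: " + Arrays.toString(payload(dg)));
    }
}
```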
We said that a socket must have a port for communication, yet we do not specify a port in TCPEchoClient. What happens if a TCP server never calls accept? What happens if a TCP client sends data on a socket that has not yet been accepted at the server?
Servers are supposed to run for a long time without stopping—therefore, they must be designed to provide good service no matter what their clients do. What is happening? Note that the response could vary by OS. What happens? Verify experimentally the size of the largest message you can send and receive using a DatagramPacket.
There is no magic: any programs that exchange information must agree on how that information will be encoded, that is, represented as a sequence of bits. This agreement regarding the form and meaning of information exchanged over a communication channel is called a protocol; a protocol used in implementing a particular application is an application protocol. In our echo example from the earlier chapters, the application protocol is trivial: neither the client's nor the server's behavior is affected by the contents of the bytes they exchange. Because in most real applications the behavior of clients and servers depends upon the information they exchange, application protocols are usually somewhat more complicated.
So from now on we consider messages to be sequences of bytes. Given this, it may be helpful to think of a transmitted message as a sequence or array of numbers, each between 0 and 255. That corresponds to the range of binary values that can be encoded in 8 bits: 00000000 for zero through 11111111 for 255.

When you build a program to exchange information via sockets with other programs, typically one of two situations applies: either you are designing and writing the programs on both sides of the socket, in which case you are free to define the application protocol yourself, or you are implementing a protocol that someone else has already specified. We have seen that bytes of information can be transmitted through a socket by writing them to an OutputStream (associated with a Socket) or encapsulating them in a DatagramPacket (which is then sent via a DatagramSocket).
However, the only data types to which these operations can be applied are bytes and arrays of bytes. As a strongly typed language, Java requires that other types (int, String, and so on) be explicitly converted to byte arrays. Fortunately, the language has built-in facilities to help with such conversions. We saw one of these in the TCP echo client: the getBytes method of String. Using the ability to send bytes, we can encode the values of other, larger primitive integer types. To do so, the sender and receiver must agree on several things. One is the size in bytes of each integer to be transmitted. For example, an int value in a Java program is represented as a 32-bit quantity.
We can therefore transmit the value of any variable or constant of type int using four bytes. Values of type short, on the other hand, are represented using 16 bits and so only require two bytes to transmit, while longs are 64 bits or eight bytes.
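To send one value of each of the four integer types (byte, short, int, and long), we need a total of 15 bytes: 1 + 2 + 4 + 8. Is that all there is to it? Not quite. For types that require more than one byte, we have to answer the question of which order to send the bytes in. There are two obvious choices: start at the most significant byte and work down (big-endian order), or start at the least significant byte and work up (little-endian order). Note that the ordering of bits within bytes is, fortunately, handled by the implementation in a standard way.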
Consider the long value L. Its bit representation in hexadecimal is 0xFB1. If we transmit the bytes in big-endian order, the sequence of decimal byte values will look like this: One last detail on which the sender and receiver must agree: Chapter 5 contains further details about this class. Sending and Receiving Data it represents 4, , , Because Java does not support unsigned integer types, encoding and decoding unsigned numbers in Java requires a little care.
Assume for now that we are dealing with signed integer types. So how do we get the correct values into the byte array of the message? The program BruteForceCoding. If we encode at the sender, we must be able to decode at the receiver.
BYTEMASK keeps the byte value from being sign-extended when it is converted to an int in the call to append, thus rendering it as an unsigned integer.
The resulting value is then cast to the type byte, which throws away all but the low-order eight bits, and placed in the array at the appropriate location. This is iterated over size bytes of the given value, val.
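The shift-and-cast loop described above can be sketched as follows. This is a minimal illustration, not the book's BruteForceCoding listing: the class name ShiftEncodeSketch and the exact method signature are assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical class name; a sketch of brute-force big-endian encoding.
class ShiftEncodeSketch {
    /**
     * Writes the low-order {@code size} bytes of {@code val} into {@code dst}
     * in big-endian order, starting at {@code offset}.
     * Returns the offset just past the last byte written.
     */
    static int encodeBigEndian(byte[] dst, long val, int offset, int size) {
        for (int i = 0; i < size; i++) {
            // Shift the wanted byte down to the low-order position; the cast
            // to byte then discards all but the low-order eight bits.
            dst[offset++] = (byte) (val >> ((size - i - 1) * Byte.SIZE));
        }
        return offset;
    }

    public static void main(String[] args) {
        byte[] message = new byte[Short.BYTES + Integer.BYTES];
        int offset = encodeBigEndian(message, 0x0102, 0, Short.BYTES);
        encodeBigEndian(message, 0x03040506, offset, Integer.BYTES);
        System.out.println(Arrays.toString(message)); // [1, 2, 3, 4, 5, 6]
    }
}
```

Returning the next free offset lets several values be packed into one array back to back, as main does with a short followed by an int.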
Demonstrate methods: If we place the return value into a long, it simply becomes the last byte of a long, producing a value of Which answer is correct depends on your application. If you expect a signed value from decoding N bytes, you must place the long result in a primitive integer type that uses exactly N bytes.
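The signed-versus-unsigned point can be made concrete with a small decoding sketch. The class name ShiftDecodeSketch and the method signature are assumptions for illustration; the BYTEMASK constant follows the name used in the text.

```java
// Hypothetical class name; a sketch of brute-force big-endian decoding.
class ShiftDecodeSketch {
    static final int BYTEMASK = 0xFF; // keeps only the low-order eight bits

    /** Reassembles {@code size} big-endian bytes into a long, treating each byte as unsigned. */
    static long decodeBigEndian(byte[] src, int offset, int size) {
        long result = 0;
        for (int i = 0; i < size; i++) {
            // Mask after widening so that a negative byte is not sign-extended.
            result = (result << Byte.SIZE) | ((long) src[offset + i] & BYTEMASK);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] wire = {(byte) 0xFF, (byte) 0xFE};
        long asLong = decodeBigEndian(wire, 0, Short.BYTES);
        short asShort = (short) asLong; // narrowing to exactly two bytes restores the sign
        System.out.println(asLong);  // 65534
        System.out.println(asShort); // -2
    }
}
```

The same two bytes yield 65534 when kept in a long and -2 when narrowed to a short, which is why the receiver has to know whether a signed value is expected.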
Can you name any others? Running the program produces output showing the following decimal byte values: It would be even worse if the encodeIntBigEndian method were not factored out as a separate method. For that reason, it is not the recommended approach, because Java provides some built-in mechanisms that are easier to use. Note that it does have 3.
The ByteArrayOutputStream class takes the sequence of bytes written to a stream and converts it to a byte array. The code for building our message looks like this: So much for the sending side. How does the receiver recover the transmitted values?
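As a rough sketch of the stream-based approach, the following wraps a ByteArrayOutputStream in a DataOutputStream to build a message, and reverses the process with a DataInputStream. The class name StreamCodingSketch, the helper method names, and the sample values are assumptions for this example; IOException is wrapped in UncheckedIOException only to keep the helpers easy to call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical class name; a sketch of encoding with the built-in stream classes.
class StreamCodingSketch {
    /** Encodes one value of each integer type in big-endian order. */
    static byte[] encode(byte b, short s, int i, long l) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeByte(b);
            out.writeShort(s);
            out.writeInt(i);
            out.writeLong(l);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reads the four values back in the same order they were written. */
    static long[] decode(byte[] msg) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(msg))) {
            return new long[] {in.readByte(), in.readShort(), in.readInt(), in.readLong()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] msg = encode((byte) 101, (short) 10001, 100000001, 1000000000001L);
        System.out.println(msg.length); // 15
        long[] values = decode(msg);
        System.out.println(values[3]); // 1000000000001
    }
}
```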
Finally, essentially everything in this subsection applies also to the BigInteger class, which supports arbitrarily large integers. However, this defeats the purpose of using a BigInteger, which can be arbitrarily large. Text is convenient because humans are accustomed to dealing with all kinds of information represented as strings of characters in books, newspapers, and on computer displays. Thus, once we know how to encode text for transmission, we can send almost any other kind of data: Obviously we can represent numbers and boolean values as Strings—for example "", "6.
Alas, there is more to it than that. In fact every String instance corresponds to a sequence (array) of characters (type char[]). A char value in Java is represented internally as an integer.
The character "X" corresponds to 88, and the symbol "!" to 33. A mapping between a set of symbols and a set of integers is called a coded character set. ASCII maps the letters of the English alphabet, digits, punctuation, and some other special (non-printable) symbols to integers between 0 and 127. It has been used for data transmission since the 1960s, and is used extensively in application protocols such as HTTP (the protocol used for the World Wide Web), even today.
Java therefore uses an international standard coded character set called Unicode to represent values of type char and String. So sender and receiver have to agree on a mapping from symbols to integers in order to communicate using text messages. Is that all they need to agree on? It depends. For a small set of characters with no integer value larger than 255, nothing more is needed because each character can be encoded as a single byte. For a code that may use larger integer values that require more than a single byte to represent, there is more than one way to encode those values on the wire.
Thus, sender and receiver need to agree on how those integers will be represented as byte sequences, that is, on an encoding scheme. The combination of a coded character set and a character encoding scheme is called a charset (see RFC). Java provides support for the use of arbitrary charsets, and every implementation is required to support at least the following: US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, and UTF-16. When you invoke the getBytes method of a String instance with no arguments, it returns a byte array containing the String encoded according to the default charset for the platform.
To ensure that a string is encoded using a particular charset, you simply supply the name of the charset as a String argument to the getBytes method. The resulting byte array contains the representation of the string in the given encoding. If you call "Test! From "Test!
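A short sketch shows how the same string produces different byte sequences under different charsets. The class name CharsetSketch and its helper methods are assumptions for illustration. Instead of passing the charset name as a String, this version passes StandardCharsets constants to the getBytes(Charset) overload, which avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class name; a sketch of encoding text with an explicit charset.
class CharsetSketch {
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    static String decode(byte[] wire, Charset charset) {
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        byte[] ascii = encode("Test!", StandardCharsets.US_ASCII);
        byte[] utf16 = encode("Test!", StandardCharsets.UTF_16BE);
        System.out.println(Arrays.toString(ascii)); // [84, 101, 115, 116, 33]
        System.out.println(Arrays.toString(utf16)); // [0, 84, 0, 101, 0, 115, 0, 116, 0, 33]
        System.out.println(decode(ascii, StandardCharsets.US_ASCII)); // Test!
    }
}
```

Decoding with a charset other than the one used for encoding would not in general recover the original string, which is why both sides must name the charset explicitly.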
The easiest way for them to do that is to simply specify one of the standard charsets.

Encoding Booleans

Bitmaps are a very compact way to encode boolean information, which is often used in protocols. The idea of a bitmap is that each of the bits of an integer type can encode one boolean value, typically with 0 representing false and 1 representing true. In general, the int value that has a 1 in bit position i, and a zero in all other bit positions, is just 2^i.
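The bitmap idea can be sketched with a pair of single-bit masks and the three usual operations. The class name BitmapSketch and the flag names are hypothetical, chosen only for this example.

```java
// Hypothetical class and flag names; a sketch of boolean flags packed into an int.
class BitmapSketch {
    // Each mask has exactly one bit set, so its value is 2^i for bit position i.
    static final int FLAG_URGENT = 1 << 5;     // bit 5  = 32
    static final int FLAG_ENCRYPTED = 1 << 12; // bit 12 = 4096

    /** Sets the masked bit by OR-ing the mask into the bitmap. */
    static int set(int bitmap, int mask) {
        return bitmap | mask;
    }

    /** Clears the masked bit by AND-ing with the complement of the mask. */
    static int clear(int bitmap, int mask) {
        return bitmap & ~mask;
    }

    /** Tests the masked bit. */
    static boolean isSet(int bitmap, int mask) {
        return (bitmap & mask) != 0;
    }

    public static void main(String[] args) {
        int flags = 0;
        flags = set(flags, FLAG_URGENT);
        flags = set(flags, FLAG_ENCRYPTED);
        System.out.println(flags); // 4128
        flags = clear(flags, FLAG_URGENT);
        System.out.println(flags); // 4096
        System.out.println(isSet(flags, FLAG_URGENT)); // false
    }
}
```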
So bit 5 is represented by 32, bit 12 by 4096, etc. Here are some example mask declarations: To set a particular bit in a bitmap, bitwise-OR the bitmap with the mask for that bit. To clear a particular bit, bitwise-AND the bitmap with the bitwise complement of the mask for that bit (which has ones everywhere except the particular bit, which is zero).

We can then wrap that instance in a DataOutputStream to send primitive data types. We would code this composition as follows: (Figure: stream composition.)

Framing refers to the problem of enabling the receiver to locate the beginning and end of a message.
Whether information is encoded as text, as multibyte binary numbers, or as some combination of the two, the application protocol must specify how the receiver of a message can determine when it has received all of the message. Of course, if a complete message is sent as the payload of a DatagramPacket, the problem is trivial: each datagram has a definite length, so the receiver knows exactly where the message ends. For messages sent over TCP sockets, however, the situation can be more complicated because TCP has no notion of message boundaries.
However, when the message can vary in length—for example, if it contains some variable-length arbitrary text strings—we do not know beforehand how many bytes to read. If a receiver tries to receive more bytes from the socket than were in the message, one of two things can happen.
If no other message is in the channel, the receiver will block and be prevented from processing the message; if the sender is also blocked waiting for a reply, the result will be deadlock.
Therefore framing is an important consideration when using TCP sockets. However, it is simplest, and also leads to the cleanest code, if you deal with these two problems separately: Here we focus on framing complete messages. The end of the message is indicated by a unique marker, an explicit byte sequence that the sender transmits immediately following the data.
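The marker must be known not to occur in the data. A special case of the delimiter-based method can be used for the last message sent on a TCP connection: the sender simply closes the sending side of the connection after sending the message. After the receiver reads the last byte of the message, it receives an end-of-stream indication (that is, read() returns -1) and thus knows it has reached the end of the message.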
The delimiter-based approach is often used with messages encoded as text: The receiver simply scans the input as characters looking for the delimiter sequence; it returns the character string preceding the delimiter.
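A delimiter-based framer along these lines might look like the following sketch. The class name DelimFramerSketch, the choice of a newline as the delimiter, and the decision to reject messages that contain the delimiter (rather than escape it) are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical class name; a sketch of delimiter-based framing.
class DelimFramerSketch {
    static final byte DELIMITER = '\n'; // assumed delimiter for this sketch

    /** Writes the message followed by the delimiter; rejects messages containing it. */
    static void frameMsg(byte[] message, OutputStream out) throws IOException {
        for (byte b : message) {
            if (b == DELIMITER) {
                throw new IOException("Message contains delimiter");
            }
        }
        out.write(message);
        out.write(DELIMITER);
        out.flush();
    }

    /** Returns the next message without its delimiter, or null at a clean end of stream. */
    static byte[] nextMsg(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int next;
        while ((next = in.read()) != DELIMITER) {
            if (next == -1) { // end of stream
                if (buf.size() == 0) {
                    return null; // no partial message pending
                }
                throw new EOFException("Non-empty message without delimiter");
            }
            buf.write(next);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        frameMsg("first".getBytes(StandardCharsets.US_ASCII), wire);
        frameMsg("second".getBytes(StandardCharsets.US_ASCII), wire);
        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        byte[] msg;
        while ((msg = nextMsg(in)) != null) {
            System.out.println(new String(msg, StandardCharsets.US_ASCII));
        }
    }
}
```

In a real client or server the streams would come from a Socket; byte-array streams are used here only so the sketch runs without a network.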
The downside of such techniques is that both sender and receiver have to scan the message. The upper bound on the message length determines the number of bytes required to encode the length: one byte suffices if messages are always shorter than 256 bytes, two bytes if they are always shorter than 65,536 bytes, and so on. It has two methods: The nextMsg method scans the stream until it reads the delimiter, then returns everything up to the delimiter; null is returned if the stream is empty. The class LengthFramer.
The sender determines the length of the given message and writes it to the output stream as a two-byte, big-endian integer, followed by the complete message.
On the receiving side, we use a DataInputStream to be able to read the length as an integer; the readFully method blocks until the given array is completely full, which is exactly what we need here.
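The length-based scheme just described can be sketched as below. The class name LengthFramerSketch and the static-method layout are assumptions for this example; the two-byte big-endian prefix, the use of DataInputStream, and readFully follow the description in the text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical class name; a sketch of length-based framing with a two-byte prefix.
class LengthFramerSketch {
    static final int MAX_MESSAGE_LENGTH = 65535; // largest value a two-byte length can hold
    static final int BYTEMASK = 0xFF;

    /** Writes a two-byte big-endian length followed by the message itself. */
    static void frameMsg(byte[] message, OutputStream out) throws IOException {
        if (message.length > MAX_MESSAGE_LENGTH) {
            throw new IOException("Message too long");
        }
        // The length may not fit in a signed short, so write it a byte at a time.
        out.write((message.length >> Byte.SIZE) & BYTEMASK);
        out.write(message.length & BYTEMASK);
        out.write(message);
        out.flush();
    }

    /** Returns the next message, or null if the stream ends before a length prefix. */
    static byte[] nextMsg(DataInputStream in) throws IOException {
        int length;
        try {
            length = in.readUnsignedShort();
        } catch (EOFException e) {
            return null; // no more messages
        }
        byte[] message = new byte[length];
        in.readFully(message); // blocks until the array is full; throws on early EOF
        return message;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        frameMsg(new byte[300], wire);
        frameMsg(new byte[] {1, 2, 3}, wire);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(nextMsg(in).length); // 300
        System.out.println(nextMsg(in).length); // 3
        System.out.println(nextMsg(in));        // null
    }
}
```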
Note that, with this framing method, the sender does not have to inspect the content of the message being framed; it needs only to check that the message does not exceed the length limit. Note that the length value can be too big to store in a short, so we write it a byte at a time.

RMI lets you invoke methods on different Java virtual machines, hiding all the messy details of argument encoding and decoding.
Serialization handles conversion of actual Java objects to byte sequences for you, so you can transfer actual instances of Java objects between virtual machines. These capabilities might seem like communication Nirvana, but in reality they are not always the best solution, for several reasons. For example, the serialized form of an object generally includes information that is meaningless outside the context of the Java Virtual Machine (JVM).
Voting protocol. Sending and Receiving Data a candidate ID, which is an integer between 0 and Two types of requests are supported. An inquiry asks the server how many votes have been cast for the given candidate. The server sends back a response message containing the original candidate ID and the vote total as of the time the request was received for that candidate.
A voting request actually casts a vote for the indicated candidate. The server again responds with a message containing the candidate ID and the vote total which now includes the vote just cast. For our simple example, the messages sent by client and server are very similar.
In this case, we can get away with a single class, VoteMsg, for both kinds of messages. A VoteMsgCoder provides the methods for vote message serialization and deserialization. Our purpose here is to emphasize that the abstract representation is independent of the details of the encoding. Then comes the candidate ID, followed by the vote count, both encoded as decimal strings.
This illustrates a very important point about implementing protocols: In this case, the fromWire method throws an exception if the expected string is not present.
This little bit of redundancy provides the receiver with a small degree of assurance that it is receiving a proper voting message.

In the binary encoding, the second byte of the message always contains zeros, and the third and fourth bytes contain the candidateID. The encoding method takes advantage of the fact that the high-order two bytes of a valid candidateID are always zero. Note also the use of bitwise-or operations to encode the booleans using a single bit each.
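A binary vote-message coder consistent with this layout might be sketched as follows. The text above does not give the actual constants, so the magic value, the flag bit positions, the nested Vote class, and the rule that only responses carry a vote count are all assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical class name and constants; a sketch of a binary vote-message encoding.
class VoteBinCoderSketch {
    static final int MAGIC = 0x5400;         // assumed magic value in the high bits of byte 1
    static final int MAGIC_MASK = 0xFC00;
    static final int RESPONSE_FLAG = 0x0200; // assumed flag bit positions
    static final int INQUIRE_FLAG = 0x0100;

    /** Minimal stand-in for a vote message. */
    static final class Vote {
        final boolean isInquiry;
        final boolean isResponse;
        final int candidateID;
        final long voteCount;

        Vote(boolean isInquiry, boolean isResponse, int candidateID, long voteCount) {
            this.isInquiry = isInquiry;
            this.isResponse = isResponse;
            this.candidateID = candidateID;
            this.voteCount = voteCount;
        }
    }

    static byte[] toWire(Vote v) throws IOException {
        int magicAndFlags = MAGIC;
        if (v.isInquiry) magicAndFlags |= INQUIRE_FLAG;   // one bit per boolean
        if (v.isResponse) magicAndFlags |= RESPONSE_FLAG;
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeShort(magicAndFlags); // bytes 1-2: magic and flags, then a zero byte
        out.writeShort(v.candidateID); // bytes 3-4: low-order two bytes of the ID
        if (v.isResponse) out.writeLong(v.voteCount);
        out.flush();
        return buf.toByteArray();
    }

    static Vote fromWire(byte[] input) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(input));
        int magicAndFlags = in.readUnsignedShort();
        if ((magicAndFlags & MAGIC_MASK) != MAGIC) {
            throw new IOException("Bad magic value");
        }
        boolean isResponse = (magicAndFlags & RESPONSE_FLAG) != 0;
        boolean isInquiry = (magicAndFlags & INQUIRE_FLAG) != 0;
        int candidateID = in.readUnsignedShort();
        long voteCount = isResponse ? in.readLong() : 0;
        return new Vote(isInquiry, isResponse, candidateID, voteCount);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = toWire(new Vote(false, true, 888, 42));
        Vote decoded = fromWire(wire);
        System.out.println(wire.length);       // 12
        System.out.println(decoded.voteCount); // 42
    }
}
```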
Receiving, of course, does things in the opposite order. We begin by implementing a service for use by vote servers. When a vote server receives a vote message, it handles the request by calling the handleRequest method of VoteService. The service keeps a map of candidate ID to vote count. For votes, the incremented vote count is stored back in the map. If the candidate ID does not already exist in the map, set the count to 0.
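The map-based bookkeeping can be sketched in a few lines. The class name VoteServiceSketch and the simplified handleRequest signature (taking the candidate ID and an inquiry flag, and returning the count) are assumptions for this example rather than the actual VoteService interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name and signature; a sketch of the vote-counting service.
class VoteServiceSketch {
    // Map of candidate ID to vote count
    private final Map<Integer, Long> results = new HashMap<>();

    /**
     * Handles one request and returns the candidate's current total.
     * An inquiry only reads the count; a vote increments it first.
     */
    long handleRequest(int candidateID, boolean isInquiry) {
        // A candidate not yet in the map starts with a count of 0.
        long count = results.getOrDefault(candidateID, 0L);
        if (!isInquiry) {
            results.put(candidateID, ++count); // store the incremented count back
        }
        return count;
    }

    public static void main(String[] args) {
        VoteServiceSketch service = new VoteServiceSketch();
        System.out.println(service.handleRequest(888, true));  // 0
        System.out.println(service.handleRequest(888, false)); // 1
        System.out.println(service.handleRequest(888, false)); // 2
        System.out.println(service.handleRequest(888, true));  // 2
    }
}
```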
The TCP client proceeds in these steps: process arguments; create the socket and get its output stream; create a binary coder and a length-based framer; create and send messages; get and parse responses. We elect to use a binary encoder for our protocol. Next, since TCP is a stream-based service, we need to provide our own framing. On the server side, the server repeatedly accepts a new client connection and uses the VoteService to generate responses to the client vote messages.
The server establishes a coder and a vote service, then repeatedly accepts and handles client connections; for each request it encodes, frames, and sends the returned response message. For UDP, we use the text encoding for our messages; however, this can be easily changed, as long as client and server agree.
The UDP client sets up a DatagramSocket and connects it, creates the vote and the coder, sends the request to the server, and then receives, decodes, and prints the server response. Of course, when we decode the datagram, we use only the actual bytes from the datagram, so we use the Arrays class to extract just that range.
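Extracting only the received bytes from a datagram buffer can be sketched as follows. The class name DatagramPayloadSketch and the helper method are assumptions, and the use of Arrays.copyOfRange is this example's choice of Arrays method. The packet is built by hand so the sketch runs without a network; after a real receive() call, getLength() reports how many bytes actually arrived.

```java
import java.net.DatagramPacket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class name; a sketch of trimming a datagram buffer to its payload.
class DatagramPayloadSketch {
    /** Copies only the bytes that belong to the datagram, ignoring unused buffer space. */
    static byte[] payload(DatagramPacket packet) {
        int start = packet.getOffset();
        return Arrays.copyOfRange(packet.getData(), start, start + packet.getLength());
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[64]; // receive buffer, larger than the message
        byte[] text = "hello".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(text, 0, buffer, 0, text.length);
        // Stand-in for a received packet whose length was set by receive().
        DatagramPacket packet = new DatagramPacket(buffer, text.length);
        System.out.println(new String(payload(packet), StandardCharsets.US_ASCII)); // hello
    }
}
```

Decoding the whole 64-byte buffer instead would append trailing zero bytes to the message, which a text decoder would then try to parse.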
The UDP server repeatedly accepts and handles client vote messages. We saw examples of both text-oriented and binary-encoded protocols. It is probably worth reiterating something we said in the Preface: That takes a great deal of experience.