Archive for December, 2007

Web design - 1498 Unicode (on CD) Appendix K Computers

Monday, December 31st, 2007

Computers process data by converting characters to numeric values. For instance, the character "a" is converted to a numeric value so that a computer can manipulate that piece of data. Localization of global software requires significant modifications to the source code, which results in increased costs and delays in releasing the product. Localization is necessary with each release of a version. By the time a software product is localized for a particular market, a newer version, which needs to be localized as well, is ready for distribution. As a result, it is cumbersome and costly to produce and distribute global software products in a market where there is no universal character encoding standard.

The Unicode Consortium developed the Unicode Standard in response to the serious problems created by multiple character encodings and the use of those encodings. The Unicode Standard facilitates the production and distribution of localized software. It outlines a specification for the consistent encoding of the world's characters and symbols. Software products that handle text encoded in the Unicode Standard still need to be localized, but the localization process is simpler and more efficient because the numeric values need not be converted.

The Unicode Standard is designed to be universal, efficient, uniform and unambiguous. A universal encoding system encompasses all commonly used characters; an efficient encoding system allows text files to be parsed easily; a uniform encoding system assigns fixed values to all characters; and an unambiguous encoding system represents the same character for any given value.

Unicode extends the limited ASCII character set to include all the major characters of the world. Unicode makes use of three Unicode Transformation Formats (UTF): UTF-8, UTF-16 and UTF-32, each of which may be appropriate for use in different contexts. UTF-8 data consists of 8-bit bytes (sequences of one, two, three or four bytes depending on the character being encoded) and is well suited for ASCII-based systems when there is a predominance of one-byte characters (ASCII represents characters as one byte). UTF-8 is a variable-width encoding form that is more compact for text involving mostly Latin characters and ASCII punctuation. UTF-16 is the default encoding form of the Unicode Standard. It is a variable-width encoding form that uses 16-bit code units instead of bytes. Most characters are represented by a single 16-bit unit, but some characters require surrogate pairs. Without surrogate pairs, the UTF-16 encoding form can encompass only about 65,000 characters, but with surrogate pairs this is expanded to include over a million characters. UTF-32 is a 32-bit encoding form. The major advantage of this fixed-width encoding form is that it expresses all characters uniformly, so that they are easy to handle in arrays and so forth.

The Unicode Standard consists of characters. A character is any written component that can be represented by a numeric value. Characters are represented using glyphs, which are various shapes, fonts and sizes for displaying characters. Code values are bit combinations that represent encoded characters. The Unicode notation for a code value is U+yyyy, in which U+ refers to Unicode code values, as opposed to other hexadecimal values, and yyyy represents a four-digit hexadecimal number. Currently, the Unicode Standard provides code values for 94,140 character representations.

An advantage of the Unicode Standard is its impact on the overall performance of the international economy. Applications that conform to an encoding standard can be processed easily by computers. Another advantage of the Unicode Standard is its portability. Applications written in Unicode can be easily transferred to different operating systems, databases, Web browsers, etc. Most companies currently support, or are planning to support, Unicode.
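The surrogate-pair mechanism mentioned in the summary can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name SurrogateSketch and its two helper methods are hypothetical and do not come from the appendix; the only JDK call used for cross-checking is Character.toChars. A code value above FFFF has 10000 hexadecimal subtracted from it, and the remaining 20 bits are split into two 10-bit halves, which is why 1,024 x 1,024 (a little over a million) extra characters become reachable.

```java
// Hypothetical helper class, not part of the appendix's Unicode.java.
final class SurrogateSketch {

    // Split a code value above 0xFFFF into its two 16-bit surrogate units.
    static char[] toSurrogates(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;                 // 20 significant bits remain
        char high = (char) (0xD800 + (offset >> 10));     // upper 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));    // lower 10 bits
        return new char[] { high, low };
    }

    // Recombine a surrogate pair into the original code value.
    static int fromSurrogates(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates(0x10300);              // OLD ITALIC LETTER A
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]);
        // Cross-check the arithmetic against the JDK's own conversion.
        System.out.println(java.util.Arrays.equals(pair, Character.toChars(0x10300)));
    }
}
```

In real programs Character.toChars and Character.toCodePoint already do this work; the hand-written arithmetic is shown only to make the mechanism visible.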

Space web hosting - Appendix K Unicode (on CD) 1497 K.7 Character

Sunday, December 30th, 2007

K.7 Character Ranges

The Unicode Standard assigns code values, which range from 0000 (Basic Latin) to E007F (Tags), to the written characters of the world. Currently, there are code values for 94,140 characters. To simplify the search for a character and its associated code value, the Unicode Standard generally groups code values by script and function (e.g., Latin characters are grouped in one block, mathematical operators are grouped in another block, etc.). As a rule, a script is a single writing system that is used for multiple languages (e.g., the Latin script is used for English, French, Spanish, etc.). The Code Charts page on the Unicode Consortium Web site lists all the defined blocks and their respective code values. Figure K.4 lists some blocks (scripts) from the Web site and their range of code values.

SUMMARY

Before Unicode, software developers were plagued by the use of inconsistent character encoding (i.e., numeric values for characters). Most countries and organizations had their own encoding systems, which were incompatible. A good example is the individual encoding systems on the Windows and Macintosh platforms.

Script                                   Range of Code Values
Arabic                                   U+0600-U+06FF
Basic Latin                              U+0000-U+007F
Bengali (India)                          U+0980-U+09FF
Cherokee (Native America)                U+13A0-U+13FF
CJK Unified Ideographs (East Asia)       U+4E00-U+9FAF
Cyrillic (Russia and Eastern Europe)     U+0400-U+04FF
Ethiopic                                 U+1200-U+137F
Greek                                    U+0370-U+03FF
Hangul Jamo (Korea)                      U+1100-U+11FF
Hebrew                                   U+0590-U+05FF
Hiragana (Japan)                         U+3040-U+309F
Khmer (Cambodia)                         U+1780-U+17FF
Lao (Laos)                               U+0E80-U+0EFF
Mongolian                                U+1800-U+18AF
Myanmar                                  U+1000-U+109F
Ogham (Ireland)                          U+1680-U+169F
Runic (Germany and Scandinavia)          U+16A0-U+16FF
Sinhala (Sri Lanka)                      U+0D80-U+0DFF
Telugu (India)                           U+0C00-U+0C7F
Thai                                     U+0E00-U+0E7F

Fig. K.4 Some character ranges.
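The grouping of code values into blocks, as listed in Fig. K.4, can also be queried from a program. The following is a minimal sketch, assuming a reasonably recent JDK; the class name BlockLookupSketch and the chosen sample code values are illustrative assumptions, while Character.UnicodeBlock.of is the standard library call that maps a code value to its block.

```java
// Hypothetical helper class for illustration only.
final class BlockLookupSketch {

    // Return the name of the block that contains the given code value.
    static String blockOf(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == null ? "no block" : block.toString();
    }

    public static void main(String[] args) {
        // One sample each from Basic Latin, Cyrillic, Hiragana,
        // CJK Unified Ideographs and Thai.
        int[] samples = { 0x0041, 0x0416, 0x3042, 0x4E95, 0x0E01 };
        for (int codePoint : samples) {
            System.out.printf("U+%04X  %s%n", codePoint, blockOf(codePoint));
        }
    }
}
```

The block boundaries known to the JDK follow the Unicode version that the JDK implements, so a range may extend further than the one printed in the figure.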

Free web hosts - 1496 Unicode (on CD) Appendix K 63 spanish

Sunday, December 30th, 2007

63       spanish = new JLabel( "\u0042\u0069\u0065\u006E\u0076" +
64          "\u0065\u006E\u0069\u0064\u0061\u0020\u0061\u0020" +
65          "Unicode\u0021" );
66       spanish.setToolTipText( "This is Spanish" );
67       container.add( spanish );
68
69    } // end Unicode constructor
70
71    // execute application
72    public static void main( String args[] )
73    {
74       Unicode application = new Unicode();
75       application.setDefaultCloseOperation(
76          JFrame.EXIT_ON_CLOSE );
77       application.pack();
78       application.setVisible( true );
79
80    } // end method main
81
82 } // end class Unicode

Fig. K.3 Java program that uses Unicode encoding (part 3 of 3).

The Unicode.java program uses escape sequences to represent characters. An escape sequence is in the form \uyyyy, where yyyy represents the four-digit hexadecimal code value. Lines 24 and 25 contain the series of escape sequences necessary to print "Welcome to Unicode!" in English. The first escape sequence (\u0057) equates to the character "W", the second escape sequence (\u0065) equates to the character "e", and so on. The \u0020 escape sequence (line 25) is the encoding for the space character. The \u0074 and \u006F escape sequences equate to the word "to". Note that "Unicode" is not encoded because it is a registered trademark and has no equivalent translation in most languages. Line 25 also contains the \u0021 escape sequence for the exclamation mark (!).

Lines 29-65 contain the escape sequences for the other seven languages. The English, French, German, Portuguese and Spanish characters are located in the Basic Latin block, the Japanese characters are located in the Hiragana block, the Russian characters are located in the Cyrillic block and the Traditional Chinese characters are located in the CJK Unified Ideographs block.

[Note: To display the output of Unicode.java properly, copy the font.properties.zh file to the font.properties files (located in the C:\Program Files\JavaSoft\JRE\1.3.1\lib and in the C:\jdk1.3.1\jre\lib directories). Save the contents of font.properties prior to overwriting them with the contents from font.properties.zh.]
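The relationship between escape sequences and code values can be checked with a short, stand-alone sketch. The class name EscapeSketch and the method codeValues are hypothetical names introduced here for illustration; they are not part of Unicode.java. The sketch shows that a string written with escape sequences is the same string as its plain spelling, and prints the U+yyyy code value of each character.

```java
// Hypothetical helper class, separate from the appendix's Unicode.java.
final class EscapeSketch {

    // Render each 16-bit unit of the string in U+yyyy notation.
    static String codeValues(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) text.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The compiler replaces each escape sequence with its character.
        String escaped = "\u0057\u0065\u006C\u0063\u006F\u006D\u0065";
        System.out.println(escaped.equals("Welcome"));   // prints true
        System.out.println(codeValues("to"));            // prints U+0074 U+006F
    }
}
```

Because the Java compiler translates escape sequences before anything else, the escaped and plain spellings are indistinguishable at run time; the escapes only matter for how the source file is typed and stored.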

Frontpage web hosting - Appendix K Unicode (on CD) 1495 public class

Saturday, December 29th, 2007

10 public class Unicode extends JFrame {
11    private JLabel english, chinese, cyrillic, french, german,
12       japanese, portuguese, spanish;
13
14    // Unicode constructor
15    public Unicode()
16    {
17       super( "Demonstrating Unicode" );
18
19       // get content pane and set its layout
20       Container container = getContentPane();
21       container.setLayout( new GridLayout( 8, 1 ) );
22
23       // JLabel constructor with a string argument
24       english = new JLabel( "\u0057\u0065\u006C\u0063\u006F" +
25          "\u006D\u0065\u0020\u0074\u006F\u0020Unicode\u0021" );
26       english.setToolTipText( "This is English" );
27       container.add( english );
28
29       chinese = new JLabel( "\u6B22\u8FCE\u4F7F\u7528\u0020" +
30          "\u0020Unicode\u0021" );
31       chinese.setToolTipText( "This is Traditional Chinese" );
32       container.add( chinese );
33
34       cyrillic = new JLabel( "\u0414\u043E\u0431\u0440\u043E" +
35          "\u0020\u043F\u043E\u0436\u0430\u043B\u043E\u0432" +
36          "\u0430\u0422\u044A\u0020\u0432\u0020Unicode\u0021" );
37       cyrillic.setToolTipText( "This is Russian" );
38       container.add( cyrillic );
39
40       french = new JLabel( "\u0042\u0069\u0065\u006E\u0076" +
41          "\u0065\u006E\u0075\u0065\u0020\u0061\u0075\u0020" +
42          "Unicode\u0021" );
43       french.setToolTipText( "This is French" );
44       container.add( french );
45
46       german = new JLabel( "\u0057\u0069\u006C\u006B\u006F" +
47          "\u006D\u006D\u0065\u006E\u0020\u007A\u0075\u0020" +
48          "Unicode\u0021" );
49       german.setToolTipText( "This is German" );
50       container.add( german );
51
52       japanese = new JLabel( "Unicode\u3078\u3087\u3045\u3053" +
53          "\u305D\u0021" );
54       japanese.setToolTipText( "This is Japanese" );
55       container.add( japanese );
56
57       portuguese = new JLabel( "\u0053\u00E9\u006A\u0061\u0020" +
58          "\u0042\u0065\u006D\u0076\u0069\u006E\u0064" +
59          "\u006F\u0020Unicode\u0021" );
60       portuguese.setToolTipText( "This is Portuguese" );
61       container.add( portuguese );
62

Fig. K.3 Java program that uses Unicode encoding (part 2 of 3).

1494 Unicode (on CD) Appendix K the Unicode (Best web hosting)

Saturday, December 29th, 2007

the Unicode characters can be viewed properly. Moreover, from this section, the user can navigate to other sites that provide information on various topics, such as fonts, linguistics and other standards (for example, the Armenian Standards Page and the Chinese GB 18030 Encoding Standard).

The Consortium section consists of five subsections: Who we are, Our Members, How to Join, Press Info and Contact Us. This section provides a list of the current Unicode Consortium members as well as information on how to become a member. Privileges for each member type (full, associate, specialist and individual) and the fees assessed to each member are listed here.

The Unicode Standard section consists of nine subsections: Start Here, Latest Version, Technical Reports, Code Charts, Unicode Data, Update & Errata, Unicode Policies, Glossary and Technical FAQ. This section describes the updates applied to the latest version of the Unicode Standard and categorizes all defined encodings. The user can learn how the latest version has been modified to encompass more features and capabilities. For instance, one enhancement of Version 3.1 is that it contains additional encoded characters. Also, if users are unfamiliar with vocabulary terms used by the Unicode Consortium, they can navigate to the Glossary subsection.

The Work in Progress section consists of three subsections: Calendar of Meetings, Proposed Characters and Submitting Proposals. This section presents the user with a catalog of the characters recently included in the Unicode Standard as well as those characters being considered for inclusion. If users determine that a character has been overlooked, they can submit a written proposal for the inclusion of that character. The Submitting Proposals subsection contains strict guidelines that must be adhered to when submitting written proposals.

The For Members section consists of two subsections: Member Resources and Working Documents. These subsections are password protected; only consortium members can access these links.

K.6 Using Unicode

Numerous programming languages (e.g., C, Java, JavaScript, Perl, Visual Basic, etc.) provide some level of support for the Unicode Standard. Figure K.3 shows a Java program that prints the text "Welcome to Unicode!" in eight different languages: English, Russian, French, German, Japanese, Portuguese, Spanish and Traditional Chinese. [Note: The Unicode Consortium's Web site contains a link to code charts that lists the 16-bit Unicode code values.]

1 // Fig. K.3: Unicode.java
2 // Demonstrating how to use Unicode in Java programs.
3
4 // Java core packages
5 import java.awt.*;
6
7 // Java extension packages
8 import javax.swing.*;
9

Fig. K.3 Java program that uses Unicode encoding (part 1 of 3).

Appendix K Unicode (on CD) 1493 K.4 Advantages/Disadvantages

Friday, December 28th, 2007

K.4 Advantages/Disadvantages of Unicode

The Unicode Standard has several significant advantages that promote its use. One is the impact it has on the performance of the international economy. Unicode standardizes the characters for the world's writing systems to a uniform model that promotes transferring and sharing data. Programs developed using such a schema maintain their accuracy because each character has a single definition (e.g., "a" is always U+0061 and "%" is always U+0025). This enables corporations to manage the high demands of international markets by processing different writing systems at the same time. Also, all characters can be managed in an identical manner, thus avoiding any confusion caused by different character code architectures. Moreover, managing data in a consistent manner eliminates data corruption, because data can be sorted, searched and manipulated using a consistent process.

Another advantage of the Unicode Standard is portability (i.e., software that can execute on disparate computers or with disparate operating systems). Most operating systems, databases, programming languages and Web browsers currently support, or are planning to support, Unicode.

A disadvantage of the Unicode Standard is the amount of memory required by UTF-16 and UTF-32. ASCII characters are 8 bits in length, so they require less storage than the default 16-bit Unicode character set. However, the double-byte character set (DBCS) and the multi-byte character set (MBCS) that encode Asian characters (ideographs) require two to four bytes, respectively. In such instances, the UTF-16 or the UTF-32 encoding forms may be used with little hindrance to memory and performance.

Another disadvantage of Unicode is that although it includes more characters than any other character set in common use, it does not yet encode all of the world's written characters.
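The memory trade-off between the encoding forms can be measured directly. The sketch below is illustrative only: the class name StorageSketch, its method names and the sample strings are assumptions made for this example. It uses the standard String.getBytes call for UTF-8 and UTF-16, and computes the UTF-32 size as four bytes per code value rather than relying on a UTF-32 charset being installed.

```java
// Hypothetical helper class for illustration only.
final class StorageSketch {

    static int utf8Bytes(String text) {
        return text.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
    }

    // Big-endian UTF-16 without a byte-order mark, so only the text is counted.
    static int utf16Bytes(String text) {
        return text.getBytes(java.nio.charset.StandardCharsets.UTF_16BE).length;
    }

    // UTF-32 is fixed width: four bytes for every code value.
    static int utf32Bytes(String text) {
        return 4 * text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String[] labels = { "Latin (Welcome)", "CJK (two ideographs)", "U+10300" };
        String[] samples = {
            "Welcome",
            "\u6B22\u8FCE",
            new String(Character.toChars(0x10300))
        };
        for (int i = 0; i < samples.length; i++) {
            System.out.printf("%-22s UTF-8=%d UTF-16=%d UTF-32=%d bytes%n",
                labels[i], utf8Bytes(samples[i]), utf16Bytes(samples[i]),
                utf32Bytes(samples[i]));
        }
    }
}
```

For the seven Latin letters the sizes are 7, 14 and 28 bytes; for the two ideographs they are 6, 4 and 8 bytes. This is the pattern the text describes: UTF-8 is most compact for ASCII-heavy text, while UTF-16 is no worse than legacy double-byte encodings for ideographs.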
Another disadvantage of the Unicode Standard is that UTF-8 and UTF-16 are variable-width encoding forms, so characters occupy different amounts of memory.

K.5 Unicode Consortium's Web Site

If you would like to learn more about the Unicode Standard, visit www.unicode.org. This site provides a wealth of information about the Unicode Standard that is insightful to those new to Unicode. Currently, the home page is organized into various sections: New to Unicode, General Information, The Consortium, The Unicode Standard, Work in Progress and For Members.

The New to Unicode section consists of two subsections: What is Unicode and How to Use this Site. The first subsection provides a technical introduction to Unicode by describing design principles, character interpretations and assignments, text processing and Unicode conformance. This subsection is recommended reading for anyone new to Unicode. Also, this subsection provides a list of related links that provide the reader with additional information about Unicode. The How to Use this Site subsection contains information about using and navigating the site as well as hyperlinks to additional resources.

The General Information section contains six subsections: Where is my Character, Display Problems, Useful Resources, Enabled Products, Mail Lists and Conferences. The main areas covered in this section include a link to the Unicode code charts (a complete listing of code values) assembled by the Unicode Consortium as well as a detailed outline on how to locate an encoded character in the code chart. Also, the section contains advice on how to configure different operating systems and Web browsers so that

1492 Unicode (Web design templates) (on CD) Appendix K being upgraded

Friday, December 28th, 2007

being upgraded because it often simplifies changes to existing programs. For this reason, UTF-8 has become the encoding form of choice on the Internet. Likewise, UTF-16 is the encoding form of choice on Microsoft Windows applications. UTF-32 is likely to become more widely used in the future as more characters are encoded with values above FFFF hexadecimal. Also, UTF-32 requires less sophisticated handling than UTF-16 in the presence of surrogate pairs. Figure K.1 shows the different ways in which the three encoding forms handle character encoding.

K.3 Characters and Glyphs

The Unicode Standard consists of characters, written components (i.e., alphabets, numbers, punctuation marks, accent marks, etc.) that can be represented by numeric values. Examples of characters include: U+0041 LATIN CAPITAL LETTER A. In the first character representation, U+yyyy is a code value, in which U+ refers to Unicode code values, as opposed to other hexadecimal values. The yyyy represents a four-digit hexadecimal number of an encoded character. Code values are bit combinations that represent encoded characters. Characters are represented using glyphs, various shapes, fonts and sizes for displaying characters. There are no code values for glyphs in the Unicode Standard. Examples of glyphs are shown in Fig. K.2.

The Unicode Standard encompasses the alphabets, ideographs, syllabaries, punctuation marks, diacritics, mathematical operators, etc. that comprise the written languages and scripts of the world. A diacritic is a special mark added to a character to distinguish it from another letter or to indicate an accent (e.g., in Spanish, the tilde "~" above the character "n"). Currently, Unicode provides code values for 94,140 character representations, with more than 880,000 code values reserved for future expansion.

Character                      UTF-8                 UTF-16          UTF-32
LATIN CAPITAL LETTER A         0x41                  0x0041          0x00000041
GREEK CAPITAL LETTER ALPHA     0xCE 0x91             0x0391          0x00000391
CJK UNIFIED IDEOGRAPH-4E95     0xE4 0xBA 0x95        0x4E95          0x00004E95
OLD ITALIC LETTER A            0xF0 0x90 0x8C 0x80   0xD800 0xDF00   0x00010300

Fig. K.1 Correlation between the three encoding forms.

Fig. K.2 Various glyphs of the character A.
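The rows of Fig. K.1 can be recomputed rather than looked up. The following sketch is an illustration under stated assumptions: the class name EncodingFormsSketch and its method names are hypothetical, and the hexadecimal formatting is chosen to resemble the figure. The UTF-8 bytes come from the JDK's own encoder, the UTF-16 units from Character.toChars, and the UTF-32 form is simply the code value padded to 32 bits.

```java
// Hypothetical helper class for illustration only.
final class EncodingFormsSketch {

    // UTF-8: one to four bytes, produced by the JDK's encoder.
    static String utf8(int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint))
            .getBytes(java.nio.charset.StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        for (byte b : bytes) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("0x%02X", b & 0xFF));
        }
        return out.toString();
    }

    // UTF-16: one 16-bit unit, or a surrogate pair above 0xFFFF.
    static String utf16(int codePoint) {
        StringBuilder out = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("0x%04X", (int) unit));
        }
        return out.toString();
    }

    // UTF-32: the code value itself in a single 32-bit unit.
    static String utf32(int codePoint) {
        return String.format("0x%08X", codePoint);
    }

    public static void main(String[] args) {
        int[] codePoints = { 0x0041, 0x0391, 0x4E95, 0x10300 };
        for (int codePoint : codePoints) {
            System.out.printf("U+%04X | %s | %s | %s%n", codePoint,
                utf8(codePoint), utf16(codePoint), utf32(codePoint));
        }
    }
}
```

For OLD ITALIC LETTER A (U+10300) this yields the UTF-8 bytes 0xF0 0x90 0x8C 0x80 and the UTF-16 surrogate pair 0xD800 0xDF00; for GREEK CAPITAL LETTER ALPHA it yields 0xCE 0x91.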

Appendix K Unicode (on CD) 1491 Unicode Consortium,

Friday, December 28th, 2007

Unicode Consortium, whose members include Apple, IBM, Microsoft, Oracle, Sun Microsystems, Sybase and many others.

When the Consortium envisioned and developed the Unicode Standard, they wanted an encoding system that was universal, efficient, uniform and unambiguous. A universal encoding system encompasses all commonly used characters. An efficient encoding system allows text files to be parsed easily. A uniform encoding system assigns fixed values to all characters. An unambiguous encoding system represents a given character in a consistent manner. These four terms are referred to as the Unicode Standard design basis.

K.2 Unicode Transformation Formats

Although Unicode incorporates the limited ASCII character set (i.e., a collection of characters), it encompasses a more comprehensive character set. In ASCII each character is represented by a byte containing 0s and 1s. One byte is capable of storing the binary numbers from 0 to 255. Each character is assigned a number between 0 and 255, thus ASCII-based systems can support only 256 characters, a tiny fraction of the world's characters. Unicode extends the ASCII character set by encoding the vast majority of the world's characters. The Unicode Standard encodes all of those characters in a uniform numerical space from 0 to 10FFFF hexadecimal. An implementation will express these numbers in one of several transformation formats, choosing the one that best fits the particular application at hand.

Three such formats are in use, called UTF-8, UTF-16 and UTF-32, depending on the size of the units in bits being used. UTF-8, a variable-width encoding form, requires one to four bytes to express each Unicode character. UTF-8 data consists of 8-bit bytes (sequences of one, two, three or four bytes depending on the character being encoded) and is well suited for ASCII-based systems when there is a predominance of one-byte characters (ASCII represents characters as one byte). Currently, UTF-8 is widely implemented in UNIX systems and in databases.

The variable-width UTF-16 encoding form expresses Unicode characters in units of 16 bits (i.e., as two adjacent bytes, or a short integer in many machines). Most characters of Unicode are expressed in a single 16-bit unit. However, characters with values above FFFF hexadecimal are expressed with an ordered pair of 16-bit units called surrogates. Surrogates are 16-bit integers in the range D800 through DFFF, which are used solely for the purpose of escaping into higher-numbered characters. Approximately one million characters can be expressed in this manner. Although a surrogate pair requires 32 bits to represent a character, it is space-efficient to use these 16-bit units. Surrogates are rare characters in current implementations. Many string-handling implementations are written in terms of UTF-16. [Note: Details and sample code for UTF-16 handling are available on the Unicode Consortium Web site at www.unicode.org.]

Implementations that require significant use of rare characters or entire scripts encoded above FFFF hexadecimal should use UTF-32, a 32-bit fixed-width encoding form that usually requires twice as much memory as UTF-16 encoded characters. The major advantage of the fixed-width UTF-32 encoding form is that it expresses all characters uniformly, so it is easy to handle in arrays.

There are few guidelines that state when to use a particular encoding form. The best encoding form to use depends on computer systems and business protocols, not on the data itself. Typically, the UTF-8 encoding form should be used where computer systems and business protocols require data to be handled in 8-bit units, particularly in legacy systems

1490 Unicode (on (My space web page) CD) Appendix K Outline K.1

Thursday, December 27th, 2007

Outline

K.1 Introduction
K.2 Unicode Transformation Formats
K.3 Characters and Glyphs
K.4 Advantages/Disadvantages of Unicode
K.5 Unicode Consortium's Web Site
K.6 Using Unicode
K.7 Character Ranges
Summary
Terminology
Self-Review Exercises
Answers to Self-Review Exercises
Exercises

K.1 Introduction

The use of inconsistent character encodings (i.e., numeric values associated with characters) when developing global software products causes serious problems because computers process information using numbers. For instance, the character "a" is converted to a numeric value so that a computer can manipulate that piece of data. Many countries and corporations have developed their own encoding systems that are incompatible with the encoding systems of other countries and corporations. For example, the Microsoft Windows operating system assigns the value 0xC0 to the character "A" with a grave accent, while the Apple Macintosh operating system assigns that same value to an upside-down question mark. This results in the misrepresentation and possible corruption of data because data is not processed as intended.

In the absence of a widely implemented universal character encoding standard, global software developers had to localize their products extensively before distribution. Localization includes the language translation and cultural adaptation of content. The process of localization usually includes significant modifications to the source code (such as the conversion of numeric values and the underlying assumptions made by programmers), which results in increased costs and delays in releasing the software. For example, some English-speaking programmers might design global software products assuming that a single character can be represented by one byte. However, when those products are localized in Asian markets, the programmer's assumptions are no longer valid, thus the majority, if not the entirety, of the code needs to be rewritten.

Localization is necessary with each release of a version. By the time a software product is localized for a particular market, a newer version, which needs to be localized as well, is ready for distribution. As a result, it is cumbersome and costly to produce and distribute global software products in a market where there is no universal character encoding standard.

In response to this situation, the Unicode Standard, an encoding standard that facilitates the production and distribution of software, was created. The Unicode Standard outlines a specification to produce consistent encoding of the world's characters and symbols. Software products that handle text encoded in the Unicode Standard still need to be localized, but the localization process is simpler and more efficient because the numeric values need not be converted and the assumptions made by programmers about the character encoding are universal. The Unicode Standard is maintained by a non-profit organization called the

Web server version - K Unicode (on CD) Objectives To become

Thursday, December 27th, 2007

K Unicode

Objectives

- To become familiar with Unicode.
- To discuss the mission of the Unicode Consortium.
- To discuss the design basis of Unicode.
- To understand the three Unicode encoding forms: UTF-8, UTF-16 and UTF-32.
- To introduce characters and glyphs.
- To discuss the advantages and disadvantages of using Unicode.
- To provide a brief tour of the Unicode Consortium's Web site.