HP 9000 User Manual page 150

Code Sets

One objective of international program design is to create an application that
is codeset independent. To create a program robust enough to accept any
codeset, you must know how data is represented in different languages and what
problems you can encounter.

As a UNIX user, you are probably familiar with ASCII, the 7-bit codeset used
to support American English. All codesets supporting the diverse languages of
international users are supersets of ASCII. This ensures that data in these
codesets can still be exchanged with the operating system, utilities, and
applications that depend on ASCII.
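The superset property can be checked directly. The following is a minimal sketch, assuming Python's built-in codecs (`latin_1`, `iso8859_2`, `iso8859_5`) as stand-ins for the system's own codeset tables; the helper names `is_seven_bit` and `same_bytes_in_all` are hypothetical and exist only for this illustration.

```python
# Python codec names used as stand-ins for the 8-bit codesets (an assumption).
EIGHT_BIT_CODECS = ("latin_1", "iso8859_2", "iso8859_5")


def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte fits in ASCII's 7-bit range (0x00-0x7F)."""
    return all(b < 0x80 for b in data)


def same_bytes_in_all(text: str, codecs=EIGHT_BIT_CODECS) -> bool:
    """Return True if text encodes to the same bytes in ASCII and every codec."""
    ascii_bytes = text.encode("ascii")
    return all(text.encode(codec) == ascii_bytes for codec in codecs)


command = "ls -l /usr/bin"
print(is_seven_bit(command.encode("ascii")))
print(same_bytes_in_all(command))
```

Because pure ASCII text produces identical bytes in each of these codesets, a command line typed by any user looks the same to a utility that only understands ASCII.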

The ISO 8859-1 and Roman8 codesets support Western European languages.
These 8-bit codesets provide 128 character codes in addition to those of
ASCII. While this extension of the ASCII character set meets the needs of
Western European users, it is not large enough to also support languages such
as Arabic and Greek, whose alphabets are completely different from those used
in Western Europe or the U.S. For these languages, other 8-bit codesets, such
as ARABIC8 and GREEK8, have been designed. ISO 8859-2 and ISO 8859-5 support
Eastern European languages such as Polish and Russian. (A complete list of
codesets and the languages they support is provided in Appendix E.)
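One consequence is that the same byte value in the upper half of an 8-bit codeset stands for a different character in each codeset. The sketch below illustrates this, assuming Python's `latin_1`, `iso8859_5`, and `iso8859_7` codecs; ISO 8859-7 is used here only as a stand-in for a Greek codeset and is not the HP GREEK8 codeset. The function name `interpret` is hypothetical.

```python
import unicodedata

# Display name -> Python codec name (stand-ins chosen for illustration).
CODECS = {
    "ISO 8859-1": "latin_1",
    "ISO 8859-5": "iso8859_5",
    "ISO 8859-7": "iso8859_7",
}


def interpret(byte_value: int) -> dict:
    """Decode one byte under each codeset and return the resulting characters."""
    raw = bytes([byte_value])
    return {name: raw.decode(codec) for name, codec in CODECS.items()}


# Byte 0xE1 lies outside ASCII, so its meaning depends on the codeset.
for name, char in interpret(0xE1).items():
    print(f"{name}: U+{ord(char):04X} {unicodedata.name(char)}")
```

A byte below 0x80, such as 0x41, decodes to the same character in all three, which is the ASCII superset property at work.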

8-bit codesets provide support to international users who speak and write
phonetic languages. A single byte, however, is not sufficient to represent the
symbols of users whose language is ideographic (for example, Traditional
Chinese, which contains over 50,000 distinct ideographs). To provide for these
users, codesets that support multi-byte characters were introduced.
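The arithmetic behind this limit is short: one byte distinguishes at most 2^8 = 256 values, while two bytes distinguish 2^16 = 65,536. The sketch below computes a lower bound on the bytes needed per character; the function name `min_bytes_per_character` is hypothetical, and real codesets typically need more room than this bound suggests because they reserve some byte values for control codes.

```python
def min_bytes_per_character(distinct_characters: int) -> int:
    """Return the fewest whole bytes able to give each character a unique code."""
    if distinct_characters < 1:
        raise ValueError("need at least one character")
    # Bits required to number the characters 0 .. distinct_characters - 1.
    bits = max(1, (distinct_characters - 1).bit_length())
    # Round up to a whole number of bytes.
    return (bits + 7) // 8


print(min_bytes_per_character(128))     # ASCII-sized repertoire
print(min_bytes_per_character(256))     # full 8-bit codeset
print(min_bytes_per_character(50_000))  # large ideographic repertoire
```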

With the introduction of encoding schemes with multi-byte characters, a
problem arose. Because users who read and write ideographic languages still
need ASCII (for communicating with the operating system and for backward
compatibility), a data stream can consist of a mixture of one-byte and
two-byte characters. The resulting problem is one of character interpretation:
How can a program interpret characters correctly, distinguishing between
single-byte and multi-byte characters? A number of solutions to this problem
have been designed.
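One common family of solutions lets the value of the first byte decide how many bytes the character occupies. The sketch below assumes a deliberately simple, hypothetical rule (any byte with its high bit set begins a two-byte character); it is not the rule of any particular codeset. The names `split_characters` and `is_lead_byte` are illustrative only.

```python
from typing import Callable, List


def split_characters(
    stream: bytes,
    is_lead_byte: Callable[[int], bool] = lambda b: b >= 0x80,  # assumed rule
) -> List[bytes]:
    """Split a mixed stream into one-byte and two-byte characters."""
    chars = []
    i = 0
    while i < len(stream):
        # The first byte alone determines the width of the character.
        width = 2 if is_lead_byte(stream[i]) else 1
        if i + width > len(stream):
            raise ValueError(f"truncated two-byte character at offset {i}")
        chars.append(stream[i:i + width])
        i += width
    return chars


mixed = b"A\xb0\xa1B"  # ASCII 'A', one two-byte character, ASCII 'B'
print(len(mixed), len(split_characters(mixed)))
```

The example stream holds four bytes but only three characters, which is why a program that equates bytes with characters miscounts, truncates, or misaligns such data.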

A group of 2-byte codesets was developed that adhere to HP15, a common
definition for interpreting a byte stream. All codesets that adhere to the

Special Topics for HP's 16-bit Interfaces A-3
