{"id":10561,"date":"2021-09-20T10:00:00","date_gmt":"2021-09-20T01:00:00","guid":{"rendered":"https:\/\/www.gigas-jp.com\/appnews\/?p=10561"},"modified":"2021-09-17T19:45:13","modified_gmt":"2021-09-17T10:45:13","slug":"handling-japanese-characters-in-php","status":"publish","type":"post","link":"https:\/\/www.gigas-jp.com\/appnews\/archives\/10561","title":{"rendered":"Handling Japanese characters in PHP"},"content":{"rendered":"\n<p>We often need to manipulate strings and characters in PHP, using various functions depending on the task: for example, cutting strings, counting string length, or replacing substrings. For these we can use built-in PHP functions such as <strong>substr<\/strong>, <strong>str_replace<\/strong>, and <strong>strlen<\/strong>. But these functions do not work reliably on Japanese characters. Why?<\/p>\n\n\n\n<p>A &#8220;bit&#8221; is either 0 or 1, and a &#8220;byte&#8221; is a group of eight consecutive bits. Since each of the eight bits can take two values, one byte can hold a total of 256 different patterns (2 to the power of 8). A different character can be assigned to each possible 8-bit pattern.<\/p>\n\n\n\n<p>This works fine as long as a language&#8217;s characters can be represented with 256 patterns or fewer.<\/p>\n\n\n\n<p>But what if a language can&#8217;t be represented with just 256 characters? Japanese obviously needs far more than that, and nowadays 256 characters is rarely enough anywhere. Fortunately, newer character encodings use anywhere from 1 to 4 bytes to define a character. Unicode is the standard behind them, and it has several encoding forms, such as UTF-32, UTF-16, and UTF-8.<\/p>\n\n\n\n<p>Unicode encodings (including UTF-8) use multiple bytes to represent characters. UTF-8 uses 1 to 4 bytes per character and can encode 1,112,064 different code points.<\/p>\n\n\n\n<p>Even after declaring UTF-8, we still can&#8217;t use the standard string functions directly. 
PHP isn&#8217;t really designed to handle multibyte characters, so using the standard string functions on them can give unexpected results. To handle multibyte characters, you need a dedicated set of functions: the mbstring functions. Use the <code>--enable-mbstring<\/code> compile-time option to enable them, and set the run-time configuration option <code>mbstring.encoding_translation<\/code> if you want incoming HTTP input converted to the internal encoding automatically.<\/p>\n\n\n\n<p>The HTTP header that accompanies a response may also carry a character set identifier, so the encoding should be declared consistently there as well. Within the script, set the internal character encoding like this.<\/p>\n\n\n\n<p><code>mb_internal_encoding(\"UTF-8\");<\/code><\/p>\n\n\n\n<p>Finally, we can use the mb string functions instead of the standard string functions.<\/p>\n\n\n\n<p>For example, we can use <code>mb_strlen<\/code> instead of <code>strlen<\/code>.<\/p>\n\n\n\n<p>You can see the various mb functions <a href=\"https:\/\/www.php.net\/manual\/en\/ref.mbstring.php\">here<\/a>.<\/p>\n\n\n\n<p>That&#8217;s all for today.<\/p>\n\n\n\n<p>Yuuma<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We often need to manipulate strings and characters in PHP, using various functions depending on the task. 
For ex [&hellip;]<\/p>\n","protected":false},"author":18,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[100],"tags":[],"acf":[],"_links":{"self":[{"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/posts\/10561"}],"collection":[{"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/users\/18"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/comments?post=10561"}],"version-history":[{"count":1,"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/posts\/10561\/revisions"}],"predecessor-version":[{"id":10566,"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/posts\/10561\/revisions\/10566"}],"wp:attachment":[{"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/media?parent=10561"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/categories?post=10561"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gigas-jp.com\/appnews\/wp-json\/wp\/v2\/tags?post=10561"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}