<!-- received="Tue Aug 10 00:33:40 1999 EET DST" -->
<!-- sent="Mon, 9 Aug 1999 23:04:22 +0200 (MET DST)" -->
<!-- name="Bruno Haible" -->
<!-- email="haible@ilog.fr" -->
<!-- subject="Re: [PATCH] tty improvement for Unicode/UTF-8 mode" -->
<!-- id="199908092116.XAA05855@jaures.ilog.fr" -->
<!-- inreplyto="199908081530.RAA28202@jaures.ilog.fr" -->
<title>Linux-kernel mailing list archive 1999-32: Re: [PATCH] tty improvement for Unicode/UTF-8 mode</title>
<body bgcolor="#FFFFFF"><font face="Arial,Helvetica">
<h1>Re: [PATCH] tty improvement for Unicode/UTF-8 mode</h1>
<b>Bruno Haible</b> (<a href="mailto:haible@ilog.fr"><i>haible@ilog.fr</i></a>)<br>
<i>Mon, 9 Aug 1999 23:04:22 +0200 (MET DST)</i>
<p>
<ul>
<li> <b>Messages sorted by:</b> <a href="date.html#274">[ date ]</a><a href="index.html#274">[ thread ]</a><a href="subject.html#274">[ subject ]</a><a href="author.html#274">[ author ]</a>
<!-- next="start" -->
<li> <b>Next message:</b> <a href="0275.html">M. Berglund: "linux scheduling"</a>
<li> <b>Previous message:</b> <a href="0273.html">Christos Ricudis: "struct device"</a>
<li> <b>In reply to:</b> <a href="0074.html">Bruno Haible: "[PATCH] tty improvement for Unicode/UTF-8 mode"</a>
<!-- nextthread="start" -->
<li> <b>Next in thread:</b> <a href="0344.html">Mailing lists: "Re: [PATCH] tty improvement for Unicode/UTF-8 mode"</a>
<li> <b>Reply:</b> <a href="0344.html">Mailing lists: "Re: [PATCH] tty improvement for Unicode/UTF-8 mode"</a>
<!-- reply="end" -->
</ul>
<hr>
<!-- body="start" -->
The same patch again, now in unified diff format.<br>
<p>
For inclusion in the development kernel:<br>
<p>
The patch below fixes the "cooked" mode editing behaviour of ttys in<br>
Unicode/UTF-8 mode. Until now, the line editor in n_tty.c has assumed that<br>
every byte &gt;= 0x20 is a character and occupies one screen position. For<br>
ttys in UTF-8 mode, this is no longer true. The patch fixes the following<br>
two problems in line editing on UTF-8 ttys:<br>
<p>
  1. When the user types BackSpace, a multi-byte character has to be<br>
     erased, not only a single byte. Also, in ECHOPRT mode, the entire<br>
     multi-byte character has to be echoed to the screen, not only one<br>
     byte.<br>
<p>
  2. When the user types a Tab or backspaces over a Tab, the kernel<br>
     needs to have the proper notion of the column number of the cursor<br>
     (tty-&gt;column). For a multi-byte character, the column number increases<br>
     by 1, not by the number of bytes that make up the character.<br>
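<p>
Both rules come down to recognising UTF-8 continuation bytes, which have the<br>
bit pattern 10xxxxxx. A minimal user-space sketch of the idea follows. The<br>
helper names are hypothetical and not part of the patch, and unlike the kernel<br>
code the sketch ignores control characters and Tabs:<br>
<p>

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch: nonzero if c is a UTF-8
 * continuation byte (10xxxxxx), the same test the patch uses. */
static int is_utf8_continuation(unsigned char c)
{
	return (c & 0xC0) == 0x80;
}

/* Count screen columns: every byte that is not a continuation byte
 * starts a new character and advances the column by one. */
static size_t utf8_columns(const unsigned char *buf, size_t len)
{
	size_t i, col = 0;

	for (i = 0; i < len; i++)
		if (!is_utf8_continuation(buf[i]))
			col++;
	return col;
}

/* Return the new length of buf after erasing its last character:
 * step back over continuation bytes until a lead byte is removed.
 * A run of stray continuation bytes with no lead byte is erased as a
 * whole here; the patch leaves such bytes alone instead. */
static size_t utf8_erase_last(const unsigned char *buf, size_t len)
{
	while (len > 0) {
		len--;
		if (!is_utf8_continuation(buf[len]))
			break;
	}
	return len;
}
```

For the three bytes "a", 0xC3, 0xA9 (the letter "a" followed by one two-byte<br>
character), utf8_columns() gives 2 and utf8_erase_last() gives 1.<br>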
<p>
The program which sets up the tty (xterm, rlogind, telnetd, etc.) has to tell<br>
the kernel that the tty will be in UTF-8 mode. For this purpose, a new<br>
tty attribute, the input flag IUTF8 (in c_iflag), is introduced as part of<br>
the "struct termio" structure.<br>
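<p>
From user space, setting the attribute could look like the sketch below. It<br>
assumes the flag is reachable through the usual tcgetattr()/tcsetattr() calls<br>
on c_iflag. The names termios_set_utf8() and tty_set_utf8() are hypothetical,<br>
and the fallback definition simply repeats the i386 value from the patch:<br>
<p>

```c
#include <termios.h>

#ifndef IUTF8
#define IUTF8 0040000	/* assumption: i386 value from the patch */
#endif

/* Hypothetical helper: set or clear IUTF8 in an in-memory termios. */
static void termios_set_utf8(struct termios *tio, int enable)
{
	if (enable)
		tio->c_iflag |= (tcflag_t)IUTF8;
	else
		tio->c_iflag &= ~(tcflag_t)IUTF8;
}

/* Hypothetical helper: switch UTF-8 mode on or off for the tty open
 * on fd. Returns 0 on success, -1 on error (errno is set by the
 * failing termios call). */
static int tty_set_utf8(int fd, int enable)
{
	struct termios tio;

	if (tcgetattr(fd, &tio) < 0)
		return -1;
	termios_set_utf8(&tio, enable);
	return tcsetattr(fd, TCSANOW, &tio);
}
```

A terminal emulator would call something like tty_set_utf8(fd, 1) on the tty<br>
it has just set up, before the user starts typing.<br>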
<p>
The patch has been tested with the console and with xterm in UTF-8 mode,<br>
both directly and across rlogin and telnet.<br>
<p>
Bruno<br>
<p>
<p>
--- linux-2.3.12/include/linux/tty.h.bak	Wed Jul 28 22:56:37 1999<br>
+++ linux-2.3.12/include/linux/tty.h	Sun Aug  8 15:27:01 1999<br>
@@ -195,6 +195,7 @@<br>
 #define I_IXANY(tty)	_I_FLAG((tty),IXANY)<br>
 #define I_IXOFF(tty)	_I_FLAG((tty),IXOFF)<br>
 #define I_IMAXBEL(tty)	_I_FLAG((tty),IMAXBEL)<br>
+#define I_IUTF8(tty)	_I_FLAG((tty),IUTF8)<br>
 <br>
 #define O_OPOST(tty)	_O_FLAG((tty),OPOST)<br>
 #define O_OLCUC(tty)	_O_FLAG((tty),OLCUC)<br>
--- linux-2.3.12/include/asm-i386/termbits.h.bak	Fri Jan  8 20:11:45 1999<br>
+++ linux-2.3.12/include/asm-i386/termbits.h	Sun Aug  8 15:27:00 1999<br>
@@ -51,6 +51,7 @@<br>
 #define IXANY	0004000<br>
 #define IXOFF	0010000<br>
 #define IMAXBEL	0020000<br>
+#define IUTF8	0040000<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0000001<br>
--- linux-2.3.12/include/asm-mips/termbits.h.bak	Fri Jan  8 20:11:45 1999<br>
+++ linux-2.3.12/include/asm-mips/termbits.h	Sun Aug  8 15:27:00 1999<br>
@@ -80,6 +80,9 @@<br>
 #if defined (__USE_BSD) || defined (__KERNEL__)<br>
 #define IMAXBEL	0020000		/* Ring bell when input queue is full.  */<br>
 #endif<br>
+#if defined (__USE_GNU) || defined (__KERNEL__)<br>
+#define IUTF8	0040000<br>
+#endif<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0000001		/* Perform output processing.  */<br>
--- linux-2.3.12/include/asm-alpha/termbits.h.bak	Fri Jan  8 20:11:45 1999<br>
+++ linux-2.3.12/include/asm-alpha/termbits.h	Sun Aug  8 15:27:00 1999<br>
@@ -62,6 +62,9 @@<br>
 # define IUCLC		0010000<br>
 # define IMAXBEL	0020000<br>
 #endif<br>
+#if defined(__KERNEL__) || defined(__USE_GNU)<br>
+# define IUTF8		0040000<br>
+#endif<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0000001<br>
--- linux-2.3.12/include/asm-m68k/termbits.h.bak	Fri Jan  8 20:11:45 1999<br>
+++ linux-2.3.12/include/asm-m68k/termbits.h	Sun Aug  8 15:27:00 1999<br>
@@ -52,6 +52,7 @@<br>
 #define IXANY	0004000<br>
 #define IXOFF	0010000<br>
 #define IMAXBEL	0020000<br>
+#define IUTF8	0040000<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0000001<br>
--- linux-2.3.12/include/asm-sparc/termbits.h.bak	Thu Mar 11 01:53:37 1999<br>
+++ linux-2.3.12/include/asm-sparc/termbits.h	Sun Aug  8 15:27:01 1999<br>
@@ -78,6 +78,7 @@<br>
 #define IXANY	0x00000800<br>
 #define IXOFF	0x00001000<br>
 #define IMAXBEL	0x00002000<br>
+#define IUTF8	0x00004000<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0x00000001<br>
--- linux-2.3.12/include/asm-ppc/termbits.h.bak	Thu Mar 11 06:30:32 1999<br>
+++ linux-2.3.12/include/asm-ppc/termbits.h	Sun Aug  8 15:27:01 1999<br>
@@ -63,6 +63,9 @@<br>
 # define IUCLC		0010000<br>
 # define IMAXBEL	0020000<br>
 #endif<br>
+#if defined(__KERNEL__) || defined(__USE_GNU)<br>
+# define IUTF8		0040000<br>
+#endif<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0000001<br>
--- linux-2.3.12/include/asm-sparc64/termbits.h.bak	Thu Mar 11 01:53:38 1999<br>
+++ linux-2.3.12/include/asm-sparc64/termbits.h	Sun Aug  8 15:27:01 1999<br>
@@ -80,6 +80,7 @@<br>
 #define IXANY	0x00000800<br>
 #define IXOFF	0x00001000<br>
 #define IMAXBEL	0x00002000<br>
+#define IUTF8	0x00004000<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0x00000001<br>
--- linux-2.3.12/include/asm-arm/termbits.h.bak	Fri Jan  8 20:11:45 1999<br>
+++ linux-2.3.12/include/asm-arm/termbits.h	Sun Aug  8 15:27:00 1999<br>
@@ -51,6 +51,7 @@<br>
 #define IXANY	0004000<br>
 #define IXOFF	0010000<br>
 #define IMAXBEL	0020000<br>
+#define IUTF8	0040000<br>
 <br>
 /* c_oflag bits */<br>
 #define OPOST	0000001<br>
--- linux-2.3.12/drivers/char/n_tty.c.bak	Tue May 11 23:37:40 1999<br>
+++ linux-2.3.12/drivers/char/n_tty.c	Sun Aug  8 15:27:01 1999<br>
@@ -178,7 +178,7 @@<br>
 		default:<br>
 			if (O_OLCUC(tty))<br>
 				c = toupper(c);<br>
-			if (!iscntrl(c))<br>
+			if (!iscntrl(c) &amp;&amp; !(I_IUTF8(tty) &amp;&amp; ((c &amp; 0xC0) == 0x80)))<br>
 				tty-&gt;column++;<br>
 			break;<br>
 		}<br>
@@ -282,7 +282,7 @@<br>
 static void eraser(unsigned char c, struct tty_struct *tty)<br>
 {<br>
 	enum { ERASE, WERASE, KILL } kill_type;<br>
-	int head, seen_alnums;<br>
+	int head, seen_alnums, cnt;<br>
 <br>
 	if (tty-&gt;read_head == tty-&gt;canon_head) {<br>
 		/* opost('\a', tty); */		/* what do you think? */<br>
@@ -315,8 +315,19 @@<br>
 <br>
 	seen_alnums = 0;<br>
 	while (tty-&gt;read_head != tty-&gt;canon_head) {<br>
-		head = (tty-&gt;read_head - 1) &amp; (N_TTY_BUF_SIZE-1);<br>
-		c = tty-&gt;read_buf[head];<br>
+		head = tty-&gt;read_head;<br>
+		if (I_IUTF8(tty)) {<br>
+			/* erase a multi-byte character */<br>
+			do {<br>
+				head = (head - 1) &amp; (N_TTY_BUF_SIZE-1);<br>
+				c = tty-&gt;read_buf[head];<br>
+			} while (((c &amp; 0xC0) == 0x80) &amp;&amp; (head != tty-&gt;canon_head));<br>
+			if ((c &amp; 0xC0) == 0x80)<br>
+				break;<br>
+		} else {<br>
+			head = (head - 1) &amp; (N_TTY_BUF_SIZE-1);<br>
+			c = tty-&gt;read_buf[head];<br>
+		}<br>
 		if (kill_type == WERASE) {<br>
 			/* Equivalent to BSD's ALTWERASE. */<br>
 			if (isalnum(c) || c == '_')<br>
@@ -324,8 +335,9 @@<br>
 			else if (seen_alnums)<br>
 				break;<br>
 		}<br>
+		cnt = (tty-&gt;read_head - head) &amp; (N_TTY_BUF_SIZE-1);<br>
+		tty-&gt;read_cnt -= cnt;<br>
 		tty-&gt;read_head = head;<br>
-		tty-&gt;read_cnt--;<br>
 		if (L_ECHO(tty)) {<br>
 			if (L_ECHOPRT(tty)) {<br>
 				if (!tty-&gt;erasing) {<br>
@@ -333,7 +345,12 @@<br>
 					tty-&gt;column++;<br>
 					tty-&gt;erasing = 1;<br>
 				}<br>
+				/* if cnt &gt; 1, output a multi-byte character */<br>
 				echo_char(c, tty);<br>
+				while (--cnt &gt; 0) {<br>
+					head = (head+1) &amp; (N_TTY_BUF_SIZE-1);<br>
+					put_char(tty-&gt;read_buf[head], tty);<br>
+				}<br>
 			} else if (kill_type == ERASE &amp;&amp; !L_ECHOE(tty)) {<br>
 				echo_char(ERASE_CHAR(tty), tty);<br>
 			} else if (c == '\t') {<br>
@@ -348,7 +365,7 @@<br>
 					else if (iscntrl(c)) {<br>
 						if (L_ECHOCTL(tty))<br>
 							col += 2;<br>
-					} else<br>
+					} else if (!(I_IUTF8(tty) &amp;&amp; ((c &amp; 0xC0) == 0x80)))<br>
 						col++;<br>
 					tail = (tail+1) &amp; (N_TTY_BUF_SIZE-1);<br>
 				}<br>
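<p>
The index arithmetic in the eraser() hunk relies on the read buffer size being<br>
a power of two, so that masking with N_TTY_BUF_SIZE-1 wraps indices around.<br>
A small stand-alone sketch of that arithmetic follows. BUF_SIZE and the helper<br>
names are stand-ins that do not appear in the kernel, 4096 is an assumed size,<br>
and unsigned arithmetic is used here to keep the wrap-around well defined:<br>
<p>

```c
/* Assumed power-of-two size, standing in for N_TTY_BUF_SIZE. */
#define BUF_SIZE 4096u

/* Step one slot backwards in the ring, wrapping from 0 to BUF_SIZE-1. */
static unsigned int ring_prev(unsigned int idx)
{
	return (idx - 1u) & (BUF_SIZE - 1u);
}

/* Number of slots from 'from' forward to 'to', modulo BUF_SIZE.
 * This mirrors how the patch computes cnt, the byte length of the
 * character being erased, from the old and new read_head. */
static unsigned int ring_distance(unsigned int from, unsigned int to)
{
	return (to - from) & (BUF_SIZE - 1u);
}
```

With these definitions, a three-byte character whose lead byte sits at index<br>
4094 and whose old read_head is 1 yields ring_distance(4094, 1) == 3.<br>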
<!-- body="end" -->
<hr>
</font></body>
