Re: Packet Number Encryption Performance

Kazuho Oku <kazuhooku@gmail.com> Sat, 23 June 2018 22:10 UTC

From: Kazuho Oku <kazuhooku@gmail.com>
Date: Sun, 24 Jun 2018 07:10:33 +0900
Message-ID: <CANatvzz33LsTMZbEbUq3R9MquFoF2VUP=-3hZq3fxvM7=GDkYQ@mail.gmail.com>
Subject: Re: Packet Number Encryption Performance
To: Nick Banks <nibanks@microsoft.com>
Cc: IETF QUIC WG <quic@ietf.org>, Ian Swett <ianswett@google.com>, Praveen Balasubramanian <pravb@microsoft.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/-ldNYyush8dM6RWjuBqORfWnfgM>

2018-06-23 23:54 GMT+09:00 Nick Banks <nibanks@microsoft.com>:

> I agree that the numbers for real workloads will be interesting, but I do
> believe that raw QUIC stack numbers are also important. They will represent
> your best-case scenario.
>
>
>

To clarify, the reason I asked about the raw numbers and the use case is
that, without them, I do not know how to interpret your numbers.

My understanding is that your benchmark is about encrypting and sending
large QUIC packets.

Assuming that BCRYPT provides performance comparable to OpenSSL, the
numbers you show (i.e. 22.5% in user mode, 38.5% in kernel mode) mean that
you want to reduce the crypto cost when a single CPU core is emitting
somewhere between 5 Gbps and 8 Gbps.

I wonder how much we would be interested in such a case. Let me explain why.

Assuming that we have many connections, each CPU core only needs to handle
somewhere around 1 Gbps to saturate a 25 Gbps link. As I stated in my
previous mail, the crypto cost will be around or below 10% in such a case.

Assuming that we are interested in utilizing a 25 Gbps link for a single
connection (or a small number of connections), I think we would generally
consider distributing both the UDP send and the encryption across multiple
CPU cores, because even without crypto it is hard to utilize the entire
bandwidth using just one core (the same goes for the receive side). And if
you distribute the load across your CPU cores, the crypto cost again
becomes somewhere around or below 10%.
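The back-of-envelope model behind these percentages can be sketched as
follows (a rough sketch only; the 607 ns and 22.4 ns per-packet crypto
costs are the OpenSSL numbers quoted below in this thread, and full-sized
1,280-octet packets are assumed):

```python
# Back-of-envelope: what fraction of one CPU core's time goes to crypto
# at a given send rate?  Sketch only; assumes full-sized 1,280-octet
# packets and the per-packet OpenSSL crypto costs quoted below.

def crypto_fraction(throughput_gbps, ns_per_packet, packet_bytes=1280):
    bytes_per_sec = throughput_gbps * 1e9 / 8
    packets_per_sec = bytes_per_sec / packet_bytes
    return packets_per_sec * ns_per_packet * 1e-9

# At ~1 Gbps per core, all crypto stays well under 10%:
print(f"{crypto_fraction(1, 607):.1%}")   # ~5.9% for all crypto
print(f"{crypto_fraction(1, 22.4):.2%}")  # ~0.22% for PNE alone

# Crypto only reaches the 22.5%-38.5% range when a single core is
# pushing several Gbps:
print(f"{crypto_fraction(6, 607):.1%}")   # ~35.6% at 6 Gbps on one core
```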

This is my understanding, but it could well be that I do not understand
what your workload looks like. That is why I am asking if you could share
the raw numbers and the use case.

> For my testing, I have a fairly simple test: open a single unidirectional
> stream and continually send data until the core gets saturated. From that,
> I grab the performance trace and specifically look at the composition of
> the QUIC stack’s send path. The send path generally constitutes all of the
> CPU for the whole connection anyway (with just a little overhead seen from
> receiving and processing ACKs). The machine I used was another server-grade
> machine running the latest Windows 10 Server Datacenter.
>
>
>
> So the percentages I shared are for just the QUIC send path (again, no PNE
> in these numbers). The numbers are the percentage of total CPU for all
> packets sent, but since all packets were pretty much the same size, the
> numbers should still hold for a single packet. And bottom line: encryption
> is a lot more of an impact than 10%. And as we bring in more performance
> improvements for the UDP send path in Windows, we expect encryption to be
> a higher and higher percentage.
>
>
>
> - Nick
>
>
>
> *From:* Kazuho Oku <kazuhooku@gmail.com>
> *Sent:* Saturday, June 23, 2018 12:00 AM
> *To:* Nick Banks <nibanks@microsoft.com>
> *Cc:* Ian Swett <ianswett@google.com>; IETF QUIC WG <quic@ietf.org>;
> Praveen Balasubramanian <pravb@microsoft.com>
>
> *Subject:* Re: Packet Number Encryption Performance
>
>
>
> IIUC, Ian should be talking about the performance numbers that you would
> see on a server that is handling a real workload, not the breakdown of
> numbers within the QUIC stack.
>
>
>
> Let's talk about nanoseconds per packet rather than the ratio, because the
> ratio depends on what you use as the "total".
>
>
>
> My benchmarks tell me that, on OpenSSL 1.1.0 running on Core i7 @ 2.5GHz
> (MacBook Pro 15" Mid 2015), the numbers are:
>
>
>
> without-PNE: 585ns / packet
>
> with-PNE: 607ns / packet (* this is the optimized version with 3.8%
> overhead)
>
>
>
> Assuming that your peak rate is 1 Gbps per CPU core (which would be enough
> to saturate 25 Gb Ethernet), the ratio of CPU cycles spent running the
> crypto will be:
>
>
>
> all crypto: 0.125GB/sec / 1280bytes/packet * 607ns/packet = 5.93 %
>
> PNE: 0.125GB/sec / 1280bytes/packet * 22.4ns/packet = 0.22 %
>
>
>
> Of course, these numbers are what is expected when a server is sending or
> receiving full-sized packets in one direction. In reality, there would be
> at least some flow in the opposite direction, so the number will be higher,
> but not by as much as 2x. Or if you are sending small packets in *both*
> directions *and* saturating the link, the ratio will be higher.
>
>
>
> But looking at the numbers, I'd assume that 10% is a logical number for the
> workload Ian has, and also for people who are involved in content
> distribution.
>
>
>
> As shown, I think that sharing the PPS and average packet size that our
> workloads will have, along with the raw numbers for the crypto (i.e. nsec /
> packet), would give us a better understanding of how much the actual
> overhead is. Do you mind sharing your expectations?
>
>
>
> 2018-06-23 9:34 GMT+09:00 Nick Banks <nibanks@microsoft.com>:
>
> Hey Guys,
>
>
>
> I spent the better part of the day getting some performance traces of
> WinQuic *without PNE*. Our implementation supports both user mode and
> kernel mode, so I got numbers for both. The following table shows the
> relative CPU cost of different parts of the QUIC send path:
>
>
>
>
>
>                             *User Mode*   *Kernel Mode*
>
> *UDP Send*                  64.7%         41.4%
>
> *Encryption*                22.5%         38.5%
>
> *Packet Building/Framing*   7%            15%
>
> *Miscellaneous*             5.8%          5.1%
>
>
>
> These numbers definitely show that encryption is a much larger portion of
> CPU.
>
>
>
> - Nick
>
>
>
>
>
> *From:* Kazuho Oku <kazuhooku@gmail.com>
> *Sent:* Friday, June 22, 2018 5:02 PM
> *To:* Ian Swett <ianswett@google.com>
> *Cc:* Nick Banks <nibanks@microsoft.com>; Praveen Balasubramanian <
> pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>
>
>
> *Subject:* Re: Packet Number Encryption Performance
>
>
>
>
>
>
>
> 2018-06-23 3:51 GMT+09:00 Ian Swett <ianswett@google.com>:
>
> I expect crypto to increase as a fraction of CPU, but I don't expect it to
> go much higher than 10%.
>
>
>
> But who knows, maybe 2 years from now everything else will be very
> optimized and crypto will be 15%?
>
>
>
> Ian, thank you for sharing the proportion of CPU cycles we are likely to
> spend for crypto.
>
>
>
> Your numbers relieve me, because even if the cost of crypto goes up to 15%,
> the overhead of PNE will be less than 1% (0.15 * 0.04 = 0.006).
>
>
>
> I would also like to note that it is likely that HyperThreading, when
> used, will eliminate the overhead of PNE.
>
>
>
> This is because, IIUC, PNE is a marginal additional use of the AES-NI
> engine, which has been mostly idle. The overhead of crypto is small enough
> (i.e. 15%) that we will rarely see contention on the engine. While one
> hyperthread does AES, the other hyperthread will run at full speed doing
> other operations.
>
>
>
> Also, considering the fact that the number of CPU cycles spent per QUIC
> packet does not change much with PNE, I would not be surprised to see *no*
> decrease in throughput when PNE is used on a HyperThreading architecture.
> In that case, all we would observe is a rise in the utilization ratio of
> the AES-NI engine.
>
>
>
>
>
> On Fri, Jun 22, 2018 at 12:34 PM Nick Banks <nibanks=40microsoft.com@
> dmarc.ietf.org> wrote:
>
> I just want to add that my implementation already uses ECB from bcrypt (and
> I do the XOR myself). Bcrypt doesn’t expose CTR mode directly.
>
>
>
> Sent from Mail for Windows 10
>
>
> ------------------------------
>
> *From:* Praveen Balasubramanian
> *Sent:* Friday, June 22, 2018 9:26:44 AM
> *To:* Ian Swett; Kazuho Oku
> *Cc:* Nick Banks; IETF QUIC WG
> *Subject:* RE: Packet Number Encryption Performance
>
>
>
> Ian, do you expect that fraction of the overall cost to hold once the UDP
> stack is optimized? Is your measurement on top of the recent kernel
> improvements? I expect the crypto fraction of the overall cost to keep
> increasing over time as the network stack bottlenecks are eliminated.
>
>
>
> Kazuho, should the draft describe the optimizations you are making? Or are
> these too OpenSSL-specific?
>
>
>
> *From:* QUIC [mailto:quic-bounces@ietf.org] *On Behalf Of *Ian Swett
> *Sent:* Friday, June 22, 2018 4:45 AM
> *To:* Kazuho Oku <kazuhooku@gmail.com>
> *Cc:* Nick Banks <nibanks@microsoft.com>; IETF QUIC WG <quic@ietf.org>
> *Subject:* Re: Packet Number Encryption Performance
>
>
>
> Thanks for digging into the details of this, Kazuho. A <4% increase in
> crypto cost is a bit more than I originally expected (~2%), but crypto is
> less than 10% of my CPU usage, so it's still less than 0.5% total, which is
> acceptable to me.
>
>
>
> On Fri, Jun 22, 2018 at 2:45 AM Kazuho Oku <kazuhooku@gmail.com> wrote:
>
>
>
>
>
> 2018-06-22 12:22 GMT+09:00 Kazuho Oku <kazuhooku@gmail.com>:
>
>
>
>
>
> 2018-06-22 11:54 GMT+09:00 Nick Banks <nibanks@microsoft.com>:
>
> Hi Kazuho,
>
>
>
> Thanks for sharing your numbers as well! I'm a bit confused about where you
> say you can reduce the 10% overhead to 2% to 4%. How do you plan on doing
> that?
>
>
>
> As stated in my previous mail, the 10% overhead consists of three parts,
> each consuming a comparable number of CPU cycles. Two of the three are
> related to the abstraction layer and how CTR is implemented, while the
> third is the core AES-ECB operation cost.
>
>
>
> It should be possible to remove the costly abstraction layer.
>
>
>
> It should also be possible to remove the overhead of CTR, since in PNE we
> need to XOR at most 4 octets (applying the XOR is the only difference
> between CTR and ECB). That cost should be possible to nullify.
>
>
>
> Considering these aspects, and by looking at the numbers in the OpenSSL
> source code (as well as considering the overhead of GCM), my expectation is
> 2% to 4%.
>
>
>
> Just did some experiments and it seems that the expectation was correct.
>
>
>
> The benchmarks tell me that the overhead goes down from 10.0% to 3.8% by
> doing the following:
>
>
>
> * remove the overhead of the CTR abstraction (i.e. use the ECB backend and
> do the XOR ourselves)
>
> * remove the overhead of the abstraction layer (i.e. call the method
> returned by EVP_CIPHER_meth_get_do_cipher instead of calling
> EVP_EncryptUpdate)
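
The CTR-as-ECB-plus-XOR trick above can be sketched as follows. This is
only an illustration of the construction: the `toy_ecb_encrypt_block`
function below is a stand-in PRF (SHA-256 truncated to one block), not
AES; a real implementation would call an optimized AES-ECB such as
OpenSSL's backend, as described in the bullets above.

```python
# Sketch of "use the ECB backend and do the XOR ourselves" for PNE.
# CTR mode is just the block cipher applied to a counter/sample block,
# XORed into the plaintext -- so for the <= 4 octets of a packet number
# there is no need for a general-purpose CTR abstraction.

import hashlib


def toy_ecb_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for one AES-ECB block encryption (NOT a real cipher).
    assert len(block) == 16
    return hashlib.sha256(key + block).digest()[:16]


def pne_mask(key: bytes, sample: bytes, pn_length: int) -> bytes:
    # One block-cipher call yields the whole mask; keep only what we need.
    return toy_ecb_encrypt_block(key, sample)[:pn_length]


def apply_pn_encryption(key: bytes, sample: bytes, pn: bytes) -> bytes:
    mask = pne_mask(key, sample, len(pn))
    return bytes(a ^ b for a, b in zip(pn, mask))


key = b"0123456789abcdef"
sample = bytes(range(16))          # ciphertext sample used as the input block
pn = b"\x00\x00\x12\x34"           # 4-octet encoded packet number

enc = apply_pn_encryption(key, sample, pn)
dec = apply_pn_encryption(key, sample, enc)  # XOR twice round-trips
assert dec == pn
```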
>
>
>
> Of course the changes are specific to OpenSSL, but I would expect similar
> numbers, assuming that you have access to an optimized AES implementation.
>
>
>
>
>
>
>
> Sent from my Windows 10 phone
>
>
>
> ------------------------------
>
> *From:* Kazuho Oku <kazuhooku@gmail.com>
> *Sent:* Thursday, June 21, 2018 7:21:17 PM
> *To:* Nick Banks
> *Cc:* quic@ietf.org
> *Subject:* Re: Packet Number Encryption Performance
>
>
>
> Hi Nick,
>
>
>
> Thank you for bringing the numbers to the list.
>
>
>
> I have just run a small benchmark using Quicly, and I see comparable
> numbers.
>
>
>
> To be precise, I see a 10.0% increase in CPU cycles when encrypting an
> Initial packet of 1,280 octets. I expect that we will see similar numbers
> on other QUIC stacks that also use picotls (with OpenSSL as a backend).
> Note that the number compares only the cost of encryption; the overhead
> ratio will be much smaller if we look at the total number of CPU cycles
> spent by a QUIC stack as a whole.
>
>
>
> Looking at the profile, the overhead consists of three operations, each
> consuming comparable CPU cycles: the core AES operation (using AES-NI),
> the CTR operation overhead, and CTR initialization. Note that picotls at
> the moment provides access to the CTR crypto beneath the AEAD interface,
> which is to be used by the QUIC stacks.
>
>
>
> I would assume that we can cut down the overhead to somewhere between 2%
> and 4%, but it might be hard to get down to somewhere near 1%, because we
> cannot parallelize the AES operation of PNE with that of AEAD (see
> https://github.com/openssl/openssl/blob/OpenSSL_1_1_0h/crypto/aes/asm/aesni-x86_64.pl#L24-L39
> about the impact of parallelization).
>
>
>
> I do not think that 2% to 4% of additional overhead on the crypto is an
> issue for QUIC/HTTP, but the current overhead of 10% is something that we
> might want to decrease. I am glad to have been able to learn that now.
>
>
>
>
>
> 2018-06-22 5:48 GMT+09:00 Nick Banks <nibanks=40microsoft.com@
> dmarc.ietf.org>:
>
> Hello QUIC WG,
>
>
>
> I recently implemented PNE for WinQuic (using bcrypt APIs) and I decided
> to get some performance numbers to see what the overhead of PNE was. I
> figured the rest of the WG might be interested.
>
>
>
> My test just encrypts the same buffer (size dependent on the test case)
> 10,000,000 times and measures the time it takes. The test then does the
> same thing, but also encrypts the packet number. I ran all of that 10
> times in total, then collected the best times for each category to produce
> the following graphs and tables (full Excel doc attached):
>
>
>
>
>
>
>
>           *Time (ms)*               *Rate (Mbps)*
>
> *Bytes*   *No PNE*    *PNE*         *PNE Overhead*   *No PNE*   *PNE*
>
> *4*       2284.671    3027.657      33%              140.064    105.692
>
> *16*      2102.402    2828.204      35%              608.827    452.584
>
> *64*      2198.883    2907.577      32%              2328.45    1760.92
>
> *256*     2758.3      3490.28       27%              7424.86    5867.72
>
> *600*     4669.283    5424.539      16%              10280      8848.68
>
> *1000*    6130.139    6907.805      13%              13050.3    11581.1
>
> *1200*    6458.679    7229.672      12%              14863.7    13278.6
>
> *1450*    7876.312    8670.16       10%              14727.7    13379.2
>
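
As a sanity check, the Rate columns in the table above follow directly
from the Time columns (10,000,000 iterations of the given buffer size):

```python
# Sanity check: the Mbps columns in the table above are derivable from
# the ms columns: rate = (bytes * 8 bits * 10,000,000 iterations) / time.

ITERS = 10_000_000


def rate_mbps(payload_bytes, elapsed_ms):
    bits = payload_bytes * 8 * ITERS
    return bits / (elapsed_ms / 1000) / 1e6


# (bytes, no-PNE ms, PNE ms) rows taken from the table above
rows = [(4, 2284.671, 3027.657), (16, 2102.402, 2828.204),
        (1450, 7876.312, 8670.16)]

for size, t_plain, t_pne in rows:
    print(size, round(rate_mbps(size, t_plain), 2),
          round(rate_mbps(size, t_pne), 2))
# e.g. 4 bytes: 140.06 Mbps without PNE, 105.69 Mbps with PNE, matching
# the table's 140.064 / 105.692 (to rounding).
```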
>
>
> I used a server-grade lab machine I had at my disposal, running the latest
> Windows 10 Server Datacenter build. Again, these numbers are for crypto
> only. No QUIC or UDP is included.
>
>
>
> Thanks,
>
> - Nick
>
>
>
>
>
>
>
> --
>
> Kazuho Oku
>



-- 
Kazuho Oku