# Category Archives: C++

## Sorting Localized Strings

Sorting becomes more complicated when you’re working in different local language strings, and the standard library solves this problem. The facet collate provides a member function compare that works like strcmp: it returns -1 if the first string is less than the second, 0 if they are equal, and 1 if the first string is greater than the second. Unlike strcmp, collate::compare uses the character semantics of the target locale.

Below code presents the function localeLessThan, which returns true if the first argument is less than the second according to the global locale. The most important part of the function is the call to compare:

col.compare(pb1, // Pointer to the first char

pb1 + s1.size( ), // Pointer to one past the last char

pb2,

pb2 + s2.size( ))

Depending on the execution character set of your implementation, below code may return the results I showed earlier or not. But if you want to ensure string comparison works in a locale-specific manner, you should use collate::compare. Of course, the standard does not require an implementation to support any locales other than “C,” so be sure to test for all the locales you support.

The locale class has built-in support for comparing characters in a given locale by overriding operator. You can use an instance of the locale class as your comparison functor when you call any standard function that takes a functor for comparison.

#include <iostream> #include <locale> #include <string> #include <vector> #include <algorithm> using namespace std; bool localeLessThan (const string& s1, const string& s2) { const collate<char>& col = use_facet<collate<char> >(locale( )); // Use the global locale const char* pb1 = s1.data( ); const char* pb2 = s2.data( ); return (col.compare(pb1, pb1 + s1.size( ), pb2, pb2 + s2.size( )) < 0); } int main( ) { // Create two strings, one with a German character string s1 = "diät"; string s2 = "dich"; vector<string> v; v.push_back(s1); v.push_back(s2); // Sort without giving a locale, which will sort according to the current global locale's rules. sort(v.begin( ), v.end( )); for (vector<string>::const_iterator p = v.begin( ); p != v.end( ); ++p) cout << *p << endl; // Set the global locale to German, and then sort locale::global(locale("german")); sort(v.begin( ), v.end( ), localeLessThan); for (vector<string>::const_iterator p = v.begin( ); p != v.end( ); ++p) cout << *p << endl; }

The first sort follows ASCII sorting convention, and therefore the output looks like

this:

dich

diät

The second sort uses the proper ordering according to German semantics, and it is just the opposite:

diät

dich

###### Related articles

- Performing Arithmetic on Bitsets in C++ (alikhuram.wordpress.com)
- Binary Search Tree template Class (BST) (alikhuram.wordpress.com)
- Representing Large Fixed-Width Integers in C++ (alikhuram.wordpress.com)
- Implementing Fixed-Point Numbers in C++ (alikhuram.wordpress.com)
- Template Compare iterator (stackoverflow.com)
- Go 1.1 performance improvements (dave.cheney.net)
- Updates and responses to reader feedback (rachelbythebay.com)
- GotW #3 Solution: Using the Standard Library (or, Temporaries Revisited) (herbsutter.com)

## Hardcoding a Unicode String in C++

Hard coding a Unicode string is mostly a matter of deciding how you want to enter the string in your source editor. C++ provides a wide-character type, wchar_t, which can store Unicode strings. The exact implementation of wchar_t is implementation defined, but it is often UTF-32. The class wstring, defined in , is a sequence of wchar_ts, just like the string class is a sequence of chars. (Strictly speaking, of course, wstring is a typedef for basic_string<wchar_t>).

The easiest way to enter Unicode characters is to use the L prefix to a string literal, as stated in below code:

wstring ws1 = L”Infinity: \u2210″; // Use the code itself

wstring ws2 = L”Euro: “; // Or just type it in

Now, you can write these wide-character strings to a wide-character stream, like this:

wcout << ws1 << endl; // wcout is the wide char version of cout

This goes for files, too:

wofstream out(“tmp\\unicode.txt”);

out << ws2 << endl;

The trickiest part of dealing with different character encodings isn’t embedding the right characters in your source files, it’s knowing what kind of character data you are getting back from a database, HTTP request, user input, and so on, and this is beyond the realm of the C++ standard. The C++ standard does not require a particular encoding, rather that the character encoding used by your operating system to store source files can be anything, as long as it supports at least the 96 characters used by the C++ language. For characters that are not part of this character set, called the basic source character set, the standard indicates that they must be available by using the \uXXXX or \UXXXXXXXX escape sequences, where each X is a hexadecimal digit.

Do this by hard coding the string with a prefix of L and typing the character into your source editor as you would any other string, or use the hexadecimal number that represents the Unicode character you’re after. Below code shows how to do it both ways.

#include <iostream>

#include <fstream>

#include <string>

using namespace std;

int main( )

{

// Create some strings with Unicode characters

wstring ws1 = L"Infinity: \u221E";

wstring ws2 = L"Euro: _";

wchar_t w[] = L"Infinity: \u221E";

wofstream out("tmp\\unicode.txt");

out << ws2 << endl;

wcout << ws2 << endl;

}

###### Related articles

- Unicode Twitter Usernames (higg.im)
- Unicode text processing (3.3) (kenmxu.wordpress.com)
- Convert number from Unicode to natural integer representation C#.NET (stackoverflow.com)

## Generating Random Numbers

To be precise, random number generation functions, including rand, return pseudo-random numbers as opposed to truly random numbers, so whenever I say random, I actually mean pseudo-random. Before using the rand function you need to seed (i.e., initialize) the random number generator with a call to srand. This assures that subsequent calls to rand won’t produce the same sequence of numbers each time the program is run. The simplest way to seed the random number generator is to pass the result from a call to clock from the header as an unsigned int. Reseeding a random number generator causes number generation to be less random.

The rand function is limited in many ways. To begin with, it only generates integers, and only does so using a uniform distribution. Furthermore, the specific random number generation algorithm used is implementation specific and, thus, random number sequences are not reproducible from system to system given the same seed. This is a problem for certain kinds of applications, as well as when testing and debugging.

If you want to generate some random floating-point numbers in the interval of [0.0, 1.0) with a uniform distribution. The C++ standard provides the C runtime function rand in the header that returns a random number in the range of 0 to RAND_MAX inclusive. The RAND_MAX macro represents the highest value returnable by the rand function. A demonstration of

using rand to generate random floating-point numbers below.

#include <cstdlib> #include <ctime> #include <iostream> using namespace std; double doubleRand( ) { return double(rand( )) / (double(RAND_MAX) + 1.0); } int main( ) { srand(static_cast(clock( ))); cout << "expect 5 numbers within the interval [0.0, 1.0)" << endl; for (int i=0; i < 5; i++) { cout << doubleRand( ) << "\n"; } cout << endl; }

The program above should produce output similar to:

expect 5 numbers within the interval [0.0, 1.0)

0.010437

0.740997

0.34906

0.369293

0.544373

A much more sophisticated alternative to rand is the Boost Random library by Jens Maurer. The Boost Random library provides several high-quality random number generation functions for both integer and floating-point types, and support for numerous kinds

of distributions. Below code demonstrates how you can produce random floating- point numbers in the interval [0,1).

#include <boost/random.hpp> #include <iostream> #include <cstdlib> using namespace std; using namespace boost; typedef boost::mt19937 BaseGenerator; typedef boost::uniform_real Distribution; typedef boost::variate_generator<basegenerator, distribution=""> Generator; double boostDoubleRand( ) { static BaseGenerator base; static Distribution dist; static Generator rng(base, dist); return rng( ); } int main( ) { cout << "expect 5 numbers within the interval [0,1)" << endl; for (int i=0; i < 5; i++) { cout << boostDoubleRand( ) << "\n"; } cout << endl; }

The main advantage of the Boost Random library, is that the pseudo-random number generation algorithm has guaranteed and reproducible randomness properties based on the precise algorithm chosen. In above code I use the Mersenne Twister generator (mt19937) because it offers a good blend of performance and randomness.

###### Related articles

- C++ – Fisher-Yates Shuffle (bradenlenz.com)
- C++ – Rock Paper Scissors Game (bradenlenz.com)
- Write Random Numbers to a File and Read Into Vector in C++ (grahamscode.wordpress.com)
- How to Generate Random Numbers in Java between Range – Example Tutorial (javarevisited.blogspot.com)
- Representing Large Fixed-Width Integers in C++ (alikhuram.wordpress.com)
- random number generator error c++ MFC (stackoverflow.com)
- Random Numbers in C (coffeetoma.wordpress.com)
- Generating two independentely seeded random number lists in c (stackoverflow.com)

## Implementing Fixed-Point Numbers in C++

A fixed-point number, like a floating-point number, is an approximate representation of a real number. A floating-point number is stored as a mantissa (m), and an exponent (e), to form the equation m * be, where b is some constant. A fixed-point number is almost the same but the exponent is also a constant. This constant is passed to the basic_fixed_real in below class template as a template parameter.

By representing e as a constant, it allows fixed-point numbers to be represented internally as integers and for the arithmetic operations on them to be performed using integer arithmetic. This can often improve the speed of basic arithmetic operations

especially addition and subtraction. Fixed-point representations are less flexible than floating-point numbers, as they can only represent a narrow range of values. The fixed_real type in below function has a range that can only represent values from –2,097,151 to +2,097,151 with a precision of 1/1,024.

Implementing addition and subtraction of fixed-point numbers is straightforward enough: I simply add or subtract the underlying representation. To perform division and multiplication, I need an extra step of shifting the mantissa left or right to adjust for the binary point.

Below class template provides the implementation of a fixed-point real number, where the number of places to the right of the binary point is a template parameter. For instance basic_fixed_real<10> has 10 binary digits to the right of the binary point, allowing it to represent numbers up to a precision of 1/1,024.

#include <iostream> using namespace std; template<int E> struct BasicFixedReal { typedef BasicFixedReal self; static const int factor = 1 << (E - 1); BasicFixedReal( ) : m(0) { } BasicFixedReal(double d) : m(static_cast(d * factor)) { } self& operator+=(const self& x) { m += x.m; return *this; } self& operator-=(const self& x) { m -= x.m; return *this; } self& operator*=(const self& x) { m *= x.m; m >>= E; return *this; } self& operator/=(const self& x) { m /= x.m; m *= factor; return *this; } self& operator*=(int x) { m *= x; return *this; } self& operator/=(int x) { m /= x; return *this; } self operator-( ) { return self(-m); } double toDouble( ) const { return double(m) / factor; } // friend functions friend self operator+(self x, const self& y) { return x += y; } friend self operator-(self x, const self& y) { return x -= y; } friend self operator*(self x, const self& y) { return x *= y; } friend self operator/(self x, const self& y) { return x /= y; } // comparison operators friend bool operator==(const self& x, const self& y) { return x.m == y.m; } friend bool operator!=(const self& x, const self& y) { return x.m != y.m; } friend bool operator>(const self& x, const self& y) { return x.m > y.m; } friend bool operator<(const self& x, const self& y) { return x.m < y.m; } friend bool operator>=(const self& x, const self& y) { return x.m >= y.m; } friend bool operator<=(const self& x, const self& y) { return x.m <= y.m; } private: int m; }; typedef BasicFixedReal<10> FixedReal; int main( ) { FixedReal x(0); for (int i=0; i < 100; ++i) { x += FixedReal(0.0625); } cout << x.toDouble( ) << endl; }

The program outputs:

6.25

###### Related articles

- Representing Large Fixed-Width Integers in C++ (alikhuram.wordpress.com)
- Performing Arithmetic on Bitsets in C++ (alikhuram.wordpress.com)
- Binary Search Tree template Class (BST) (alikhuram.wordpress.com)

## Threaded Binary Search Tree in C++

/*********************************************************************************************************

* Threaded Binary Search Tree Implementation *

* Programmer: Graham Nedelka *

* Created in April, 2013 *

* Computer Science 116: Data Structures *

*********************************************************************************************************/

#include <iostream>

#include <iomanip>

using namespace std;

struct bstNode{

// This constructs a ‘node’ container to serve as reference points for a threaded binary search

// tree. Theoretically, each node will point to another container known as a ‘child.’ In this case

// the children are located to the left and to the right of the current node. The nodes to the left

// of the current node contain integers smaller than itself, and larger to the right. What differentiates

// this implementation from binary search tree is the ‘threaded’ aspect, where the nodes within the

// in-order binary tree point to their predecessors, instead of pointing to NULL by default. This allows

// for a shorter processing time in retrieval of data…

View original post 905 more words

## Self-Sorting Linked List in C++

/*********************************************************************************************************

* Self Sorting Linked List *

* Programmer: Graham Nedelka *

* Created in May, 2013 *

* Computer Science 116: Data Structures *

*********************************************************************************************************/

#include <iostream>

#include <time.h>

#include <fstream>

#include <iomanip>

#include <vector>

using namespace std;

struct node{

int data;

node *next;

node *previous;

public:

node(int value){data = value; next = NULL; previous = NULL;};

};

class LL{

node *head;

node *tail;

public:

LL() {head = NULL;tail = NULL;};

void add(int number);

void show();

};

void LL::add(int value){

node *current = new node(value);

if (head == NULL){

tail = current;

head = current;

}

else{

node *temp = head;

node *temptail = tail;

if (value < temp->data){

node *temp2 = new node(value);

temp2->next = head;

head = temp2;

return;

}

while(temp->next != NULL){

if (value > temp->data and value < temp->next->data){

current->next = temp->next;

temp->next = current;

temp = temp->next;

}

else if (current->data > temptail->data){

temptail->next…

View original post 137 more words

## Representing Large Fixed-Width Integers in C++

Large integers are common in many applications. In cryptography, for example, integers of 1,000 bits and larger are not uncommon. The implementation I presented here is based on a binary representation of numbers using a bitset, at a cost of some performance. What I lost in performance I more than made up for in simplicity. A more efficient implementation of arbitrary precision numbers could easily fill the book.

The BigInt template below uses the bitset from the header to (from my last post) allow you to represent unsigned integers using a fixed number of bits specified as a template parameter.

//big_int.hpp #ifndef BIG_INT_HPP #define BIG_INT_HPP #include <bitset> #include "bitset_arithmetic.hpp" // from my last post of bitset manipulation. template<unsigned int N> class BigInt { typedef BigInt self; public: BigInt( ) : bits( ) { } BigInt(const self& x) : bits(x.bits) { } BigInt(unsigned long x) { int n = 0; while (x) { bits[n++] = x & 0x1; x >>= 1; } } explicit BigInt(const std::bitset& x) : bits(x) { } // public functions bool operator[](int n) const { return bits[n]; } unsigned long toUlong( ) const { return bits.to_ulong( ); } // operators self& operator<<=(unsigned int n) { bits <<= n; return *this; } self& operator>>=(unsigned int n) { bits >>= n; return *this; } self operator++(int) { self i = *this; operator++( ); return i; } self operator--(int) { self i = *this; operator--( ); return i; } self& operator++( ) { bool carry = false; bits[0] = fullAdder(bits[0], 1, carry); for (int i = 1; i < N; i++) { bits[i] = fullAdder(bits[i], 0, carry); } return *this; } self& operator--( ) { bool borrow = false; bits[0] = fullSubtractor(bits[0], 1, borrow); for (int i = 1; i < N; i++) { bits[i] = fullSubtractor(bits[i], 0, borrow); } return *this; } self& operator+=(const self& x) { bitsetAdd(bits, x.bits); return *this; } self& operator-=(const self& x) { bitsetSubtract(bits, x.bits); return *this; } self& operator*=(const self& x) { bitsetMultiply(bits, x.bits); return *this; } self& operator/=(const self& x) { std::bitset tmp; bitsetDivide(bits, x.bits, bits, tmp); return *this; } self& operator%=(const self& x) { std::bitset tmp; bitsetDivide(bits, x.bits, tmp, bits); return *this; } self operator~( ) const { return ~bits; } self& operator&=(self x) { bits &= x.bits; return *this; } self& operator|=(self x) { bits |= x.bits; return *this; } self& operator^=(self x) { bits ^= x.bits; return *this; } // friend functions friend self operator<<(self x, unsigned int n) { return x <<= n; } friend self operator>>(self x, unsigned int n) { return x >>= n; } friend self operator+(self x, const self& y) { return x += y; } friend self operator-(self x, const self& y) { return x -= y; } friend self operator*(self x, const self& y) { return x *= y; } friend self operator/(self x, const self& y) { return x /= y; } friend self operator%(self x, const self& y) { return x %= y; } friend self operator^(self x, const self& y) { return x ^= y; } friend self operator&(self x, const self& y) { return x &= y; } friend self operator|(self x, const self& y) { return x |= y; } // comparison operators friend bool operator==(const self& x, const self& y) { return x.bits == y.bits; } friend bool operator!=(const self& x, const self& y) { return x.bits != y.bits; } friend bool operator>(const self& x, const self& y) { return bitsetGt(x.bits, y.bits); } friend bool operator<(const self& x, const self& y) { return bitsetLt(x.bits, y.bits); } friend bool operator>=(const self& x, const self& y) { return bitsetGtEq(x.bits, y.bits); } friend bool operator<=(const self& x, const self& y) { return bitsetLtEq(x.bits, y.bits); } private: std::bitset<N> bits; };

The BigInt template class could be used to represent factorials, as shown below,

//Using the big_int class #include "big_int.hpp" #include <iostream> #include <vector> #include <iterator> #include <algorithm> using namespace std; void outputBigInt(BigInt<1024> x) { vector<int> v; if (x == 0) { cout << 0; return; } while (x > 0) { v.push_back((x % 10).to_ulong( )); x /= 10; } copy(v.rbegin( ), v.rend( ), ostream_iterator(cout, "")); cout << endl; } int main( ) { BigInt<1024> n(1); // compute 32 factorial for (int i=1; i <= 32; ++i) { n *= i; } outputBigInt(n); }

The program output is:

263130836933693530167218012160000000

###### Related articles

- Performing Arithmetic on Bitsets in C++ (alikhuram.wordpress.com)
- Binary Search Tree template Class (BST) (alikhuram.wordpress.com)
- Comparisons in C++ (davidcorne.com)
- C++ Primer Plus (6th ed.) CH 3: Dealing with Data (goneim.wordpress.com)