Category Archives: C++

Comparing modern C++ and Rust in terms of safety and performance

C++ and Rust are two popular programming languages that offer a balance between performance and safety. While C++ has been around for decades and is widely used in many applications, Rust is a relatively new language that has gained popularity for its safety features. In this post, we will compare the safety and performance of modern C++ and Rust.

Safety

Null Pointers:

One of the most common errors in C++ is null pointer dereferencing. A null pointer is a pointer that does not point to any object. Dereferencing one is undefined behavior; in practice it usually triggers a segmentation fault that crashes the program. In C++, it is the responsibility of the programmer to ensure that pointers are not null before dereferencing them.

Rust, on the other hand, has a built-in option type that ensures that a value is either present or absent. This eliminates the need for null pointers. For example, consider the following C++ code:

int* ptr = nullptr;
*ptr = 42;

This code has undefined behavior and will typically crash with a segmentation fault, because the pointer ptr is null. In Rust, a value that may be absent is expressed as follows:

let mut opt = None; 
opt = Some(42);

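In this case, opt is an Option that is either Some with a value or None. The compiler does not let us use the inner value without first handling the None case, for example with match or if let. Calling unwrap() on a None does compile, but it panics at run time with a clear message instead of reading invalid memory. This eliminates null pointer dereferencing in safe Rust.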

Data Races:

Another common error in concurrent programming is data races. A data race occurs when two or more threads access the same memory location simultaneously, and at least one of the accesses is a write. This can result in undefined behavior, including incorrect results or crashes.

C++ has a number of synchronization primitives that can be used to prevent data races, such as mutexes and condition variables. However, it is the responsibility of the programmer to ensure that these primitives are used correctly. In Rust, the ownership and borrowing system ensures that only one thread can modify a value at a time, preventing data races. For example, consider the following C++ code:

#include <thread>
#include <iostream>

int x = 0;
void increment() 
{
  for (int i = 0; i < 1000000; i++) 
  { x++; } 
} 

int main() 
{
  std::thread t1(increment);
  std::thread t2(increment);
  t1.join();
  t2.join();
  std::cout << x << std::endl; 
}

In this code, two threads increment the variable x at the same time without any synchronization. This is a data race, because both threads modify the same memory location, so the behavior is undefined and the printed total may be wrong. Rust refuses to compile a direct translation of this program; the shared counter has to be wrapped in a synchronization type such as a mutex:

use std::sync::Mutex;
use std::thread;

fn main() {
    let x = Mutex::new(0);

    // Scoped threads may borrow x because they are joined before the scope ends.
    thread::scope(|s| {
        for _ in 0..2 {
            s.spawn(|| {
                for _ in 0..1000000 {
                    let mut guard = x.lock().unwrap();
                    *guard += 1;
                }
            });
        }
    });

    println!("{}", *x.lock().unwrap());
}

In this code, we use a mutex to ensure that only one thread can modify x at a time. This prevents data races and ensures that the output is correct.

Performance:

Compilation Time: One area where C++ has traditionally been faster than Rust is in compilation time. C++ compilers have been optimized for decades to compile large code bases quickly. However, Rust has been improving its compilation times with recent releases.

For example, consider compiling the following C++ code:

#include <iostream>

int main() {
    std::cout << "Hello, world!" << std::endl;
    return 0;
}

Using the Clang compiler on a MacBook Pro with a 2.6 GHz Intel Core i7 processor, this code takes about 0.5 seconds to compile. In comparison, compiling the equivalent Rust code:

fn main() {
    println!("Hello, world!");
}

takes about 0.8 seconds with the Rust compiler. C++ is faster in this example, although a one-line program says little about large code bases. Rust is improving its compilation times and is expected to become more competitive in the future.

Execution Time:

When it comes to execution time, C++ and Rust are both high-performance languages that can be used to write fast, low-level code. However, the specific performance characteristics of each language can vary depending on the task at hand.

For example, consider the following C++ code that calculates the sum of all integers from 1 to 1 billion:

#include <iostream>

int main() {
    long long sum = 0;
    for (int i = 1; i <= 1000000000; i++) {
        sum += i;
    }
    std::cout << sum << std::endl;
    return 0;
}

Using the Clang compiler on a MacBook Pro with a 2.6 GHz Intel Core i7 processor, this code takes about 5 seconds to execute. In comparison, the equivalent Rust code:

fn main() {
    // u64 is required: the result (500000000500000000) overflows the default i32.
    let mut sum: u64 = 0;
    for i in 1..=1000000000 {
        sum += i;
    }
    println!("{}", sum);
}

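takes about 6 seconds to execute on the same machine. In this case, C++ is slightly faster than Rust, but the difference is not significant. Timings of several seconds for such a loop suggest unoptimized builds; with optimizations enabled (-O2 for Clang, --release for Cargo), compilers can typically simplify a loop like this so far that both programs finish almost instantly.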

Memory Usage:

Another important factor to consider when comparing performance is memory usage. C++ and Rust both provide low-level control over memory, which can be beneficial for performance. However, this also means that the programmer is responsible for managing memory and avoiding memory leaks.

For example, consider the following C++ code that creates a large array of integers:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> arr(100000000);
    std::cout << arr.size() << std::endl;
    return 0;
}

This code creates a vector of 100 million integers, which takes up about 400 MB of memory. In comparison, the equivalent Rust code:

fn main() {
    let arr = vec![0; 100000000];
    println!("{}", arr.len());
}

also takes up about 400 MB of memory, because the elements are 4-byte integers in both programs. In this case, C++ and Rust have similar memory usage.

Conclusion

Both C++ and Rust offer a balance between performance and safety, but Rust’s ownership and borrowing system gives it an edge in terms of safety. Rust’s memory safety features make it less prone to common programming errors like null pointer dereferencing, data races, and memory leaks.

In terms of performance, both C++ and Rust are high-performance languages, and the small measurements above show no decisive advantage for either. Rust’s zero-cost abstractions let it offer its safety guarantees without a large runtime penalty, and its compile-time checks on shared data make it well-suited for concurrent and parallel computing.

Ultimately, the choice between C++ and Rust depends on the specific needs of your project. If you need low-level memory management and direct hardware access, C++ may be the better choice. If you prioritize safety and memory efficiency, Rust may be the way to go.

Posted in C++, C++11, Multithreading, Programming

Tags: C++, Code optimization, Comparative analysis, Execution time, High-performance languages, Low-level control, Memory management, Memory usage, Performance, Programming languages, Rust, Speed optimization

Top 10 Most Common C++ Mistakes That Developers Make

Jul 7

Posted by Khuram Ali

There are many pitfalls that a C++ developer may encounter. This can make quality programming very hard and maintenance very expensive. Learning the language syntax and having good programming skills in similar languages, like C# and Java, just isn’t enough to utilize C++’s full potential. It requires years of experience and great discipline to avoid errors in C++. In this article, we are going to take a look at some of the common mistakes that are made by developers of all levels if they are not careful enough with C++ development.

Common Mistake #1: Using “new” and ”delete” Pairs Incorrectly

No matter how much we try, it is very difficult to free all dynamically allocated memory. Even if we can do that, it is often not safe from exceptions. Let us look at a simple example:

void SomeMethod()
{
    ClassA *a = new ClassA;
    SomeOtherMethod(); // it can throw an exception
    delete a;
}

If an exception is thrown, the “a” object is never deleted. The following example shows a safer and shorter way to do that. It uses auto_ptr, which was deprecated in C++11 and removed in C++17, but the old standard is still found in legacy code. Where possible, replace it with C++11 unique_ptr or scoped_ptr from Boost.

void SomeMethod()
{
    std::auto_ptr<ClassA> a(new ClassA); // deprecated, please check the text
    SomeOtherMethod(); // it can throw an exception
}

No matter what happens, after creating the “a” object it will be deleted as soon as the program execution exits from the scope.

However, this was just the simplest example of this C++ problem. There are many examples when deleting should be done at some other place, perhaps in an outer function or another thread. That is why the use of new/delete in pairs should be completely avoided and appropriate smart pointers should be used instead.

Common Mistake #2: Forgotten Virtual Destructor

This is one of the most common errors. Deleting a derived object through a pointer to a base class that has no virtual destructor is undefined behavior; in practice the derived destructor is skipped, which leaks any dynamic memory allocated inside the derived class. There are some cases when a virtual destructor is not desirable, e.g. when a class is not intended for inheritance and its size and performance are crucial. A virtual destructor or any other virtual function introduces additional data inside a class structure, namely a pointer to a virtual table, which makes every instance of the class bigger.

However, in most cases classes can be inherited even if it is not originally intended. So it is a very good practice to add a virtual destructor when a class is declared. Otherwise, if a class must not contain virtual functions due to performance reasons, it is a good practice to put a comment inside the class declaration file indicating that the class should not be inherited or, since C++11, to mark the class final. One of the best options to avoid this issue is to use an IDE that supports virtual destructor creation during class creation.

One additional point on the subject concerns classes/templates from the standard library. They are not intended for inheritance and they do not have a virtual destructor. If, for example, we create a new enhanced string class that publicly inherits from std::string, there is a possibility that somebody will use it incorrectly through a pointer or a reference to std::string and cause a memory leak.

class MyString : public std::string
{
public: // without this the destructor would be private and "delete" could not compile
    ~MyString() {
        // …
    }
};

int main()
{
    std::string *s = new MyString();
    delete s; // May not invoke the destructor defined in MyString
}

To avoid such C++ issues, a safer way of reusing a class/template from the standard library is to use private inheritance or composition.

Common Mistake #3: Deleting an Array With “delete” or Using a Smart Pointer

Creating temporary arrays of dynamic size is often necessary. Once they are no longer required, it is important to free the allocated memory. The big problem here is that C++ requires a special delete operator with [] brackets, which is forgotten very easily. The delete[] operator will not just free the memory allocated for an array; it will first call the destructors of all objects in the array. It is also incorrect to use the delete operator without [] brackets for primitive types, even though there is no destructor for these types. An implementation is free to lay out array allocations differently from single-object allocations, so using delete without [] brackets results in undefined behaviour too.

Using smart pointers such as auto_ptr, unique_ptr<T>, or shared_ptr<T> with arrays is also incorrect. When such a smart pointer goes out of scope, it calls delete without [] brackets, which results in the same issues described above. If a smart pointer is required for an array, it is possible to use scoped_array or shared_array from Boost, or the unique_ptr<T[]> specialization (C++17 adds shared_ptr<T[]> as well).

If functionality of reference counting is not required, which is mostly the case for arrays, the most elegant way is to use STL vectors instead. They don’t just take care of releasing memory, but offer additional functionalities as well.

Common Mistake #4: Returning a Local Object by Reference

This is mostly a beginner’s mistake, but it is worth mentioning since there is a lot of legacy code that suffers from this issue. Let’s look at the following code where a programmer wanted to do some kind of optimization by avoiding unnecessary copying:

Complex& SumComplex(const Complex& a, const Complex& b)
{
    Complex result;
    …..
    return result;
}

Complex& sum = SumComplex(a, b);

The reference “sum” will now refer to the local object “result”. But where is the object “result” located after the SumComplex function is executed? Nowhere. It was located on the stack, but after the function returned the stack was unwound and all local objects from the function were destroyed. Using the reference is therefore undefined behaviour, even for primitive types. To avoid performance issues, sometimes it is possible to use return value optimization:

Complex SumComplex(const Complex& a, const Complex& b)
{
    return Complex(a.real + b.real, a.imaginar + b.imaginar);
}

Complex sum = SumComplex(a, b);

For most of today’s compilers, if a return line contains a constructor of an object the code will be optimized to avoid all unnecessary copying – the constructor will be executed directly on the “sum” object. Since C++17 this elision is guaranteed by the language for a return statement of this form.

Common Mistake #5: Using a Reference to a Deleted Resource

These C++ problems happen more often than you may think, and are usually seen in multithreaded applications. Let us consider the following code:

Thread 1:
Connection& connection = connections.GetConnection(connectionId);
// …

Thread 2:
connections.DeleteConnection(connectionId);
// …

Thread 1:
connection.send(data);

In this example, if both threads used the same connection ID this will result in undefined behavior. Access violation errors are often very hard to find.

In these cases, when more than one thread accesses the same resource, it is very risky to keep pointers or references to the resource, because some other thread can delete it. It is much safer to use smart pointers with reference counting, for example shared_ptr from Boost or, since C++11, std::shared_ptr. It uses atomic operations for increasing/decreasing the reference counter, so the ownership bookkeeping is thread safe; access to the shared object itself still has to be synchronized.

Common Mistake #6: Allowing Exceptions to Leave Destructors

It is rarely necessary to throw an exception from a destructor. Even then, there is a better way to do that. However, exceptions are mostly not thrown from destructors explicitly. It can happen that a simple command that logs the destruction of an object throws an exception. Let’s consider the following code:

class A
{
public:
    A(){}
    ~A()
    {
        writeToLog(); // could cause an exception to be thrown
    }
};

// …

try
{
    A a1;
    A a2;
}
catch (std::exception& e)
{
    std::cout << "exception caught";
}

In the code above, if an exception is thrown twice, such as during the destruction of both objects, the catch statement is never executed. Because two exceptions are in flight at the same time, no matter whether they are of the same type or different types, the C++ runtime environment does not know how to handle them and calls the terminate function, which ends the program’s execution. Since C++11 the rule is even stricter: destructors are implicitly noexcept, so a single exception escaping from ~A() already calls std::terminate.

So the general rule is: never allow exceptions to leave destructors. Even if it is ugly, a call that may throw has to be guarded like this:

try
{
    writeToLog(); // could cause an exception to be thrown
}
catch (...) {}

Common Mistake #7: Using “auto_ptr” (Incorrectly)

The auto_ptr template was deprecated in C++11 and removed in C++17 for a number of reasons. It still appears in legacy code bases that were developed in C++98. It has a certain characteristic that is probably not familiar to all C++ developers, and could cause serious problems for somebody who is not careful. Copying an auto_ptr object transfers ownership from one object to another. For example, the following code:

auto_ptr<ClassA> a(new ClassA); // deprecated, please check the text
auto_ptr<ClassA> b = a;
a->SomeMethod(); // "a" is now empty: undefined behavior, typically an access violation

… will typically result in an access violation error. Only object “b” will contain a pointer to the object of Class A, while “a” will be empty. Trying to access a class member through “a” dereferences a null pointer. There are many ways of using auto_ptr incorrectly. Four very critical things to remember about them are:

Never use auto_ptr inside STL containers. Copying of containers will leave source containers with invalid data. Some STL algorithms can also lead to invalidation of “auto_ptr”s.
Never use auto_ptr as a function argument since this will lead to copying, and leave the value passed to the argument invalid after the function call.
If auto_ptr is used for data members of a class, be sure to make a proper copy inside a copy constructor and an assignment operator, or disallow these operations by making them private.
Whenever possible use some other modern smart pointer instead of auto_ptr.

Common Mistake #8: Using Invalidated Iterators and References

It would be possible to write an entire book on this subject. Every STL container has some specific conditions in which it invalidates iterators and references. It is important to be aware of these details while using any operation. Just like the previous C++ problem, this one can also occur very frequently in multithreaded environments, so it is required to use synchronization mechanisms to avoid it. Let’s look at the following sequential code as an example:

vector<string> v;
v.push_back("string1");
string& s1 = v[0]; // assign a reference to the 1st element
vector<string>::iterator iter = v.begin(); // assign an iterator to the 1st element
v.push_back("string2");
cout << s1; // access to a reference of the 1st element
cout << *iter; // access to an iterator of the 1st element

From a logical point of view the code seems completely fine. However, adding the second element to the vector may result in reallocation of the vector’s memory, which makes both the iterator and the reference invalid. Using them in the last 2 lines is then undefined behavior and typically shows up as an access violation error.

Common Mistake #9: Passing an Object by Value

You probably know that it is a bad idea to pass objects by value due to its performance impact. Many leave it like that to avoid typing extra characters, or plan to return later to do the optimization. It usually never gets done, and as a result leads to less performant code and code that is prone to unexpected behavior:

class A
{
public:
    virtual std::string GetName() const {return "A";}
    …
};

class B: public A
{
public:
    virtual std::string GetName() const {return "B";}
    …
};

void func1(A a)
{
    std::string name = a.GetName();
    …
}

B b;
func1(b);

This code will compile. Calling of the “func1” function will create a partial copy of the object “b”, i.e. it will copy only class “A”’s part of the object “b” to the object “a” (“slicing problem”). So inside the function it will also call a method from the class “A” instead of a method from the class “B” which is most likely not what is expected by somebody who calls the function.

Similar problems occur when attempting to catch exceptions. For example:

class ExceptionA: public std::exception { /* … */ };
class ExceptionB: public ExceptionA { /* … */ };

try
{
    func2(); // can throw an ExceptionB exception
}
catch (ExceptionA ex)
{
    writeToLog(ex.GetDescription());
    throw;
}

When an exception of type ExceptionB is thrown from the function “func2” it will be caught by the catch block, but because of the slicing problem only the ExceptionA part will be copied into “ex”, so the wrong method will be called. The bare throw; still re-throws the original ExceptionB object; writing throw ex; instead would propagate the sliced copy to an outside try-catch block.

To summarize, pass class objects by reference (usually const reference), not by value, unless a copy is really intended, and catch exceptions by reference.

Common Mistake #10: Using User Defined Conversions by Constructor and Conversion Operators

User defined conversions are sometimes very useful, but they can lead to unexpected conversions that are very hard to locate. Let’s say somebody created a library that has a string class:

class String
{
public:
    String(int n);
    String(const char *s);
    ….
};

The first method is intended to create a string of a length n, and the second is intended to create a string containing the given characters. But the problem starts as soon as you have something like this:

String s1 = 123;
String s2 = 'abc';

In the example above, s1 will become a string of size 123, not a string that contains the characters “123”. The second example contains single quotation marks instead of double quotes (which may happen by accident) which will also result in calling of the first constructor and creating a string with a very big size. These are really simple examples, and there are many more complicated cases that lead to confusion and unpredicted conversions that are very hard to find. There are 2 general rules of how to avoid such problems:

Define a constructor with explicit keyword to disallow implicit conversions.
Instead of using conversion operators, use explicit conversion methods. It requires a little bit more typing, but it is much cleaner to read and can help avoid unpredictable results.

Conclusion

C++ is a powerful language. In fact, many of the applications that you use every day on your computer and have come to love are probably built using C++. As a language, C++ gives a tremendous amount of flexibility to the developer, through some of the most sophisticated features seen in object-oriented programming languages. However, these sophisticated features or flexibilities can often become the cause of confusion and frustration for many developers if not used responsibly. Hopefully this list will help you understand how some of these common mistakes influence what you can achieve with C++.

“This article was written by Vatroslav Bodrozic , a Toptal developer.”

Posted in C++, C++11, Programming

Break…?

Aug 13

Posted by Khuram Ali

This is a replica of the code that caused a major disruption of AT&T phone service throughout the U.S. AT&T’s network was in large part unusable for about nine hours starting on the afternoon of January 15, 1990. Telephone exchanges are all computer systems these days, and this code was running on a model 4ESS Central Office Switching System.

It demonstrates that it is too easy in C to overlook exactly which control constructs are affected by a “break” statement.

network code()
{
    switch (line) {
    case THING1:
        doit1();
        break;
    case THING2:
        if (x == STUFF) {
            do_first_stuff();
            if (y == OTHER_STUFF)
                break;
            do_later_stuff();
        } /* coder meant to break to here… */
        initialize_modes_pointer();
        break;
    default:
        processing();
    } /* …but actually broke to here! */
    use_modes_pointer(); /* leaving the modes_pointer uninitialized */
}

This is a simplified version of the code, but the bug was real enough. The programmer wanted to break out of the “if” statement, forgetting that “break” actually gets you out of the nearest enclosing iteration or switch statement. Here, it broke out of the switch, and executed the call to use_modes_pointer() —but the necessary initialization had not been done, causing a failure further on.

This code eventually caused the first major network problem in AT&T’s 114-year history. The saga is described in greater detail on page 11 of the January 22, 1990 issue of Telephony magazine. The supposedly fail-safe design of the network signaling system actually spread the fault in a chain reaction, bringing down the entire long distance network.

And it all rested on a C switch statement.

Posted in C++, Programming

Tags: Break;, disruption of AT&T phone service, first major network problem in AT&T, Switch statment

Const Keyword in C and C++ (A few interesting points)

Aug 7

Posted by Khuram Ali

Consider the following code snippet. Does it even compile?

foo (const char **conChar) {}

main (int argc, char **argv)
{
    foo (argv);
}

This code does not compile cleanly: the compiler reports incompatible types for the argument. C++ rejects the call as an error, while C compilers diagnose it, traditionally as a warning. (Don’t forget to add a return type to foo in C++.)

Now consider the snippet below:

char * cp ;
const char * ccp;
ccp = cp;

The left operand is a pointer to “char qualified by const”.
The right operand is a pointer to “char” unqualified.
The type char is a compatible type with char, and the type pointed to by the left operand has all the qualifiers of the type pointed to by the right operand (none), plus one of its own (const).

(Note that the assignment cannot be made the other way around. Try it if you don’t believe me.

cp = ccp; /* a warning in C, an error in C++ */)

Here we are apparently doing the same thing, assigning a non-const-qualified pointer to a const-qualified pointer, but it compiles without an error in both C and C++.

So why does the first snippet fail to compile while the second one does?

const char ** denotes a pointer to an unqualified type. Its type is a pointer to a pointer to a qualified type.

Since the types char ** and const char ** are both pointers to unqualified types that are not the same type, they are not compatible types. Therefore, a call with an argument of type char ** corresponding to a parameter of type const char ** is not allowed.

Now what will happen, if I don’t use * or **….?

Consider the code below:

int cp ;
const int ccp;
ccp = cp;

You will have compile error in both c and c++ compilers… as you cannot assign a new value to a variable declared const…

Is it really so?

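Consider the following C code:

int cp = 12;
const int ccp = 10;
int * alter;
alter = &ccp; /* discards the const qualifier; C compilers warn here */
*alter = cp;
printf("value is %d", ccp);

In C, we can take the address of a const variable and assign it to a pointer to non-const; the compiler only issues a warning about the discarded qualifier. We then have access to the memory location of the constant and can attempt to change the value through the pointer. Modifying a const object this way is undefined behavior, so the printed value depends on the compiler and the optimization level.

C++ is stricter: it rejects the assignment alter = &ccp outright, because a const int * does not convert implicitly to an int *. A const_cast would make it compile, but the write would still be undefined behavior.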

Posted in C++

Tags: Const keyword in C and C++

C++ Coding Standards: Throw by Value, Catch by Reference | Summary | InformIT

Jun 28

Posted by Khuram Ali

C++ Coding Standards: Throw by Value, Catch by Reference | Summary | InformIT.

Posted in C++

Tags: Andrei Alexandrescu, C++ Coding Standards, Catch by Reference, Herb Sutter, Throw by Value

Using Function Pointers for Callbacks in C++

Jun 21

Posted by Khuram Ali

You plan to call some function func1, and at runtime you need it to invoke another function func2. For one reason or another, however, you cannot simply hardcode the name of func2 within func1. func2 may not be known definitively at compile time, or perhaps func1 belongs to a third-party API that you can’t change and recompile. In either case, you need a callback function.

In a situation such as the one shown in the code below, a function pointer is a good idea if updateProgress and longOperation shouldn’t know anything about each other. For example, a function that updates the progress by displaying it to the user (in a user interface (UI) dialog box, in a console window, or somewhere else) does not care about the context in which it is invoked. Similarly, the longOperation function may be part of some data loading API that doesn’t care whether it’s invoked from a graphical UI, a console window, or by a background process.

The first thing you will want to do is determine the signature of the function you plan to call and create a typedef for it. typedef is your friend when it comes to function pointers, because their syntax is ugly. Consider how you would declare a function pointer variable f that contains the address of a function that takes a single integer argument and returns a boolean. It would look like this:
bool (*f)(int); // f is the variable name
One could argue, convincingly, that this is no big deal and that I’m just a whiner. But what if you want a vector of such function pointers?
vector<bool (*)(int)> vf;
Or an array of them?
bool (*af[10])(int);
Function pointers do not look like ordinary C++ variable declarations whose format is often a (qualified) type name followed by a variable name. This is why they can make for messy reading.
Thus, in the code below, I used a typedef like this:
typedef bool (*FuncPtrBoolInt)(int);

Once that was out of the way, I was free to declare function pointers that have the signature of returning bool and accepting a single integer argument as I would any other sort of parameter, like so:

void longOperation(FuncPtrBoolInt f) {
// …
Now, all longOperation needs to do is call f like it would any function:
f (l/1000000);
In this way, f can be any function that accepts an integer argument and returns bool. Consider a caller of longOperation that doesn’t care about the progress. It can pass in a function pointer of a no-op function:
bool whoCares(int i) {return(true);}
//…
longOperation(whoCares);
More importantly, which function to pass to longOperation can be determined dynamically at runtime.


#include <iostream>

// An example of a callback function
bool updateProgress(int pct)
{
    std::cout << pct << "% complete...\n";
    return(true);
}

// A typedef to make for easier reading
typedef bool (*FuncPtrBoolInt)(int);

// A function that runs for a while
void longOperation(FuncPtrBoolInt f)
{
    for (long l = 0; l < 100000000; l++)
        if (l % 10000000 == 0)
            f(l / 1000000);
}

int main( )
{
    longOperation(updateProgress); // ok
}

Posted in C++

Tags: Application programming interface, Boolean data type, Callback, Compiler, Declaration (computer programming), function pointer, Integers, Typedef, Using Function Pointers for Callbacks in C++

Parsing a Simple XML Document using C++

Jun 17

Posted by Khuram Ali

TinyXml is an excellent choice for applications that need to do just a bit of XML processing. Its source distribution is small, it’s easy to build and integrate with projects, and it has a very simple interface. It also has a very permissive license. Its main limitations are that it doesn’t understand XML Namespaces, can’t validate against a DTD or schema, and can’t parse XML documents containing an internal DTD. If you need to use any of these features, or any of the XML-related technologies such as XPath or XSLT, you should use the other libraries.
The TinyXml parser produces a representation of an XML document as a tree whose nodes represent the elements, text, comments and other components of an XML document. The root of the tree represents the XML document itself. This type of representation
of a hierarchical document as a tree is known as a Document Object Model (DOM). The TinyXml DOM is similar to the one designed by the World Wide Web Consortium (W3C), although it does not conform to the W3C specification.
In keeping with the minimalist spirit of TinyXml, the TinyXml DOM is simpler than the W3C DOM, but also less powerful. The nodes in the tree representing an XML document can be accessed through the interface TiXmlNode, which provides methods to access a node’s parent, to enumerate its child nodes, and to remove child nodes or insert additional child nodes. Each node is actually an instance of a more derived type; for example, the root of the tree is an instance of TiXmlDocument, nodes representing elements are instances TiXmlElement, and nodes representing text are instances of TiXmlText. The type of a TiXmlNode can be determined by calling its Type( ) method; once you knowthe type of a node, you can obtain a representation of the node as a more derived type by calling one of the convenience methods such as toDocument( ), toElement( ) and toText( ).

These derived types contain additional methods appropriate to the type of node they represent. With that background, the example code is easy to follow. First, the function textValue( ) extracts the text content from an element that contains only text, such as name, species, or dateOfBirth. It does this by first checking that an element has only one child, and that the child is a text node. It then obtains the child’s text by calling the Value( ) method, which returns the textual content of a text node or comment node, the tag name of an element node, and the filename of a root node.
Next, the function nodeToContact( ) takes a node corresponding to a veterinarian or trainer element and constructs a Contact object from the values of its name and phone attributes, which it retrieves using the Attribute( ) method. Similarly, the function nodeToAnimal( ) takes a node corresponding to an animal element and constructs an Animal object. It does this by iterating over the node’s children using the NextSiblingElement( ) method, extracting the data contained in each element, and setting the corresponding property of the Animal object. The data is extracted using the function textValue( ) for the elements name, species, and dateOfBirth and the function nodeToContact( ) for the elements veterinarian and trainer.

In the main function, I first construct a TiXmlDocument object corresponding to the file animals.xml and parse it using the LoadFile( ) method. I then obtain a TiXmlElement corresponding to the document root by calling the RootElement( ) method. Next, I iterate over the children of the root element, constructing an Animal object from each animal element using the function nodeToAnimal( ). Finally, I iterate over the collection of Animal objects, writing them to standard output. One feature of TinyXml that is not illustrated in the example code is the SaveFile( ) method of TiXmlDocument, which writes the document represented by a TiXmlDocument to a file. This allows you to parse an XML document, modify it using the DOM interface, and save the modified document. You can even create a TiXmlDocument from scratch and save it to disk:
// Create a document hello.xml, consisting
// of a single "hello" element
TiXmlDocument doc;
TiXmlElement root("hello");
doc.InsertEndChild(root);
doc.SaveFile("hello.xml");

Use the TinyXml library. First, define an object of type TiXmlDocument and call its LoadFile( ) method, passing the pathname of your XML document as its argument. If LoadFile( ) returns true, your document has been successfully parsed. If parsing was successful, call the RootElement( ) method to obtain a pointer to an object of type TiXmlElement representing the document root. This object has a hierarchical structure that reflects the structure of your XML document; by traversing this structure, you can extract information about the document and use this information to create a collection of C++ objects.
For example, suppose you have an XML document animals.xml representing a collection of circus animals, as shown in the first listing below. The document root is named animalList and has a number of child animal elements, each representing an animal owned by the Feldman Family Circus. Suppose you also have a C++ class named Animal, and you want to construct a std::vector of Animals corresponding to the animals listed in the document.


<?xml version="1.0" encoding="UTF-8"?>
<!-- Feldman Family Circus Animals -->
<animalList>
<animal>
<name>Herby</name>
<species>elephant</species>
<dateOfBirth>1992-04-23</dateOfBirth>
<veterinarian name="Dr. Hal Brown" phone="(801)595-9627"/>
<trainer name="Bob Fisk" phone="(801)881-2260"/>
</animal>
<animal>
<name>Sheldon</name>
<species>parrot</species>
<dateOfBirth>1998-09-30</dateOfBirth>
<veterinarian name="Dr. Kevin Wilson" phone="(801)466-6498"/>
<trainer name="Eli Wendel" phone="(801)929-2506"/>
</animal>
<animal>
<name>Dippy</name>
<species>penguin</species>
<dateOfBirth>2001-06-08</dateOfBirth>
<veterinarian name="Dr. Barbara Swayne" phone="(801)459-7746"/>
<trainer name="Ben Waxman" phone="(801)882-3549"/>
</animal>
</animalList>

The second listing shows how the definition of the class Animal might look. Animal has five data members corresponding to an animal’s name, species, date of birth, veterinarian, and trainer. An animal’s name and species are represented as std::strings, its date of birth is represented as a boost::gregorian::date from Boost.Date_Time, and its veterinarian and trainer are represented as instances of the class Contact, also defined in the same header. The third listing shows how to use TinyXml to parse the document animals.xml, traverse the parsed document, and populate a std::vector of Animals using data extracted from the document.

The header animal.hpp:


#ifndef ANIMALS_HPP_INCLUDED
#define ANIMALS_HPP_INCLUDED
#include <ostream>
#include <string>
#include <stdexcept> // runtime_error
#include <boost/date_time/gregorian/gregorian.hpp>
#include <boost/regex.hpp>
// Represents a veterinarian or trainer
class Contact

{
public:

Contact( ) { }
Contact(const std::string& name, const std::string& phone)
: name_(name)
{
setPhone(phone);
}
std::string name( ) const { return name_; }
std::string phone( ) const { return phone_; }
void setName(const std::string& name) { name_ = name; }
void setPhone(const std::string& phone)
{
using namespace std;
using namespace boost;
// Use Boost.Regex to verify that phone
// has the form (ddd)ddd-dddd
static regex pattern("\\([0-9]{3}\\)[0-9]{3}-[0-9]{4}");
if (!regex_match(phone, pattern)) {
throw runtime_error(string("bad phone number:") + phone);
}
phone_ = phone;
}
private:
std::string name_;
std::string phone_;
};

// (for completeness, you should also define operator!=)
inline bool operator==(const Contact& lhs, const Contact& rhs)
{
return lhs.name() == rhs.name() && lhs.phone() == rhs.phone();
}
// Writes a Contact to an ostream
inline std::ostream& operator<<(std::ostream& out, const Contact& contact)
{
out << contact.name( ) << " " << contact.phone( );
return out;
}
// Represents an animal
class Animal

 {
public:
// Default constructs an Animal; this is
// the constructor you'll use most
Animal( ) { }
// Constructs an Animal with the given properties;
Animal( const std::string& name,
const std::string& species,
const std::string& dob,
const Contact& vet,
const Contact& trainer )
: name_(name),
species_(species),
vet_(vet),
trainer_(trainer)
{
setDateOfBirth(dob);
}
// Getters
std::string name( ) const { return name_; }
std::string species( ) const { return species_; }
boost::gregorian::date dateOfBirth( ) const { return dob_; }
Contact veterinarian( ) const { return vet_; }
Contact trainer( ) const { return trainer_; }
// Setters
void setName(const std::string& name) { name_ = name; }
void setSpecies(const std::string& species) { species_ = species; }
void setDateOfBirth(const std::string& dob)
{
dob_ = boost::gregorian::from_string(dob);
}
void setVeterinarian(const Contact& vet) { vet_ = vet; }
void setTrainer(const Contact& trainer) { trainer_ = trainer; }
private:
std::string name_;
std::string species_;
boost::gregorian::date dob_;
Contact vet_;
Contact trainer_;
};

// (for completeness, you should also define operator!=)
inline bool operator==(const Animal& lhs, const Animal& rhs)
{
return lhs.name() == rhs.name() &&
lhs.species() == rhs.species() &&
lhs.dateOfBirth() == rhs.dateOfBirth() &&
lhs.veterinarian() == rhs.veterinarian() &&
lhs.trainer() == rhs.trainer();
}
// Writes an Animal to an ostream
inline std::ostream& operator<<(std::ostream& out, const Animal& animal)
{
out << "Animal {\n"
<< " name=" << animal.name( ) << ";\n"
<< " species=" << animal.species( ) << ";\n"
<< " date-of-birth=" << animal.dateOfBirth( ) << ";\n"

<< " veterinarian=" << animal.veterinarian( ) << ";\n"
<< " trainer=" << animal.trainer( ) << ";\n"
<< "}";
return out;
}
#endif // #ifndef ANIMALS_HPP_INCLUDED
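The phone check in Contact::setPhone( ) can be tried in isolation. As a minimal sketch, the same pattern is used below with the standard <regex> header (C++11), which is swapped in here for Boost.Regex so that the example needs no external library. The function name isValidPhone is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if phone has the form (ddd)ddd-dddd.
bool isValidPhone(const std::string& phone)
{
    // Same pattern as in setPhone( ); regex_match requires the whole
    // string to match, not just a part of it.
    static const std::regex pattern("\\([0-9]{3}\\)[0-9]{3}-[0-9]{4}");
    return std::regex_match(phone, pattern);
}
```

Because regex_match anchors the pattern at both ends, a number with a missing parenthesis or an extra trailing digit is rejected.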

Parsing animals.xml with TinyXml:

#include <exception>
#include <iostream> // cout
#include <stdexcept> // runtime_error
#include <cstdlib> // EXIT_FAILURE
#include <cstring> // strcmp
#include <vector>
#include <tinyxml.h>
#include "animal.hpp"
using namespace std;
// Extracts the content of an XML element that contains only text
const char* textValue(TiXmlElement* e)
{
TiXmlNode* first = e->FirstChild( );
if ( first != 0 &&
first == e->LastChild( ) &&
first->Type( ) == TiXmlNode::TEXT )
{
// the element e has a single child, of type TEXT;
// return the child's text
return first->Value( );
} else {
throw runtime_error(string("bad ") + e->Value( ) + " element");
}
}
// Constructs a Contact from a "veterinarian" or "trainer" element
Contact nodeToContact(TiXmlElement* contact)
{
using namespace std;
const char *name, *phone;
if ( contact->FirstChild( ) == 0 &&
(name = contact->Attribute("name")) &&
(phone = contact->Attribute("phone")) )
{
// The element contact is childless and has "name"
// and "phone" attributes; use these values to
// construct a Contact
return Contact(name, phone);
} else {
throw runtime_error(string("bad ") + contact->Value( ) + " element");

}
}
// Constructs an Animal from an "animal" element
Animal nodeToAnimal(TiXmlElement* animal)
{
using namespace std;
// Verify that animal corresponds to an "animal" element
if (strcmp(animal->Value( ), "animal") != 0) {
throw runtime_error(string("bad animal: ") + animal ->Value( ));
}
Animal result; // Return value
TiXmlElement* element = animal->FirstChildElement( );
// Read name
if (element && strcmp(element->Value( ), "name") == 0) {
// The first child element of animal is a "name"
// element; use its text value to set the name of result
result.setName(textValue(element));
} else {
throw runtime_error("no name element");
}
// Read species
element = element->NextSiblingElement( );
if (element && strcmp(element->Value( ), "species") == 0) {
// The second child element of animal is a "species"
// element; use its text value to set the species of result
result.setSpecies(textValue(element));
} else {
throw runtime_error("no species element");
}
// Read date of birth
element = element->NextSiblingElement( );
if (element && strcmp(element->Value( ), "dateOfBirth") == 0) {
// The third child element of animal is a "dateOfBirth"
// element; use its text value to set the date of birth
// of result
result.setDateOfBirth(textValue(element));
} else {
throw runtime_error("no dateOfBirth element");
}
// Read veterinarian
element = element->NextSiblingElement( );
if (element && strcmp(element->Value( ), "veterinarian") == 0) {
// The fourth child element of animal is a "veterinarian"
// element; use it to construct a Contact object and
// set result's veterinarian
result.setVeterinarian(nodeToContact(element));
} else {
throw runtime_error("no veterinarian element");
}
// Read trainer
element = element->NextSiblingElement( );
if (element && strcmp(element->Value( ), "trainer") == 0) {
// The fifth child element of animal is a "trainer"
// element; use it to construct a Contact object and
// set result's trainer
result.setTrainer(nodeToContact(element));
} else {
throw runtime_error("no trainer element");
}
// Check that there are no more children
element = element->NextSiblingElement( );
if (element != 0) {
throw runtime_error(
string("unexpected element:") +
element->Value( )
);
}
return result;
}

Main Program:


int main( )
{
using namespace std;
try {
vector<Animal> animalList;
// Parse "animals.xml"
TiXmlDocument doc("animals.xml");
if (!doc.LoadFile( ))
throw runtime_error("bad parse");
// Verify that root is an animal-list
TiXmlElement* root = doc.RootElement( );
if (strcmp(root->Value( ), "animalList") != 0) {
throw runtime_error(string("bad root: ") + root->Value( ));
}
// Traverse children of root, populating the list
// of animals
for ( TiXmlElement* animal = root->FirstChildElement( );
animal;
animal = animal->NextSiblingElement( ) )

{
animalList.push_back(nodeToAnimal(animal));
}
// Print the animals
for ( vector<Animal>::size_type i = 0,
n = animalList.size( );
i < n;
++i )
{
cout << animalList[i] << "\n";
}
} catch (const exception& e) {
cout << e.what( ) << "\n";
return EXIT_FAILURE;
}
}

Posted in C++

Tags: C++, Document Object Model, Document Type Definition, Parsing a Simple XML Document using C++, Root element, TinyXml, W3C DOM, World Wide Web Consortium, XML, XPath

Posted on June 12, 2013 | Link

Creating a Thread using C++ Boost Lib.

May 29

Posted by Khuram Ali

Creating a thread is deceptively simple. All you have to do is create a thread object on the stack or the heap, and pass it a functor that tells it where it can begin working. For this discussion, a “thread” is actually two things. First, it’s an object of the class thread, which is a C++ object in the conventional sense. When I am referring to this object, I will say “thread object.” Then there is the thread of execution, which is an operating system thread that is represented by the thread object. When I say “thread,” I mean the operating system thread. Let’s get right to the code in the example. The thread constructor takes a functor (or function pointer) that takes no arguments and returns void. Look at this line from the example at the end of this post:

boost::thread myThread(threadFun);
This creates the myThread object on the stack, which represents a new operating system thread that begins executing threadFun. At that point, the code in threadFun and the code in main are, at least in theory, running in parallel. They may not actually be running in parallel, of course, because your machine may have only one processor, in which case this is impossible (recent processor architectures have made this not quite true, but I’ll ignore multicore processors and the like for now). If you have only one processor, then the operating system will give each thread you create a slice of time in the run state before it is suspended. Because these slices of time can be of varying sizes, you can never be guaranteed which thread will reach a particular point first.
This is the aspect of multithreaded programming that makes it difficult: multithreaded program state is nondeterministic. The same multithreaded program, run multiple times, with the same inputs, can produce different output. Coordinating resources used by multiple threads is the subject of my next post.
After creating myThread, the main thread continues, at least for a moment, until it reaches the next line:
boost::thread::yield( );
This tells the operating system that the current thread (in this case the main thread) wants to give up the rest of its slice of time, so the scheduler can switch to another thread or another process using some operating-system-specific policy. Meanwhile, the new thread is executing threadFun. When threadFun is done, the child thread goes away. Note that the thread object doesn’t go away, because it’s still a C++ object that’s in scope. This is an important distinction.
The thread object is something that exists on the heap or the stack, and works just like any other C++ object. When the calling code exits scope, any stack thread objects are destroyed and, alternatively, when the caller calls delete on a thread*, the corresponding heap thread object disappears. But thread objects are just proxies for the actual operating system threads, and when they are destroyed the operating system threads aren’t guaranteed to go away. They merely become detached, meaning that they cannot later be rejoined. This is not a bad thing. Threads use resources, and in any (well-designed) multithreaded application, access to such resources (objects, sockets, files, raw memory, and so on) is controlled with mutexes, which are objects used for serializing access to something among multiple threads.
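As a minimal sketch of serialized access, the following uses the standard library's std::thread and std::mutex (C++11), which were modeled on the Boost classes discussed here and are swapped in so that the example needs no external library. The names countWithMutex, workers and iterations are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: starts `workers` threads that each increment a
// shared counter `iterations` times, and returns the final count.
long countWithMutex(int workers, int iterations)
{
    assert(workers >= 0 && iterations >= 0);
    long counter = 0;
    std::mutex m;                            // guards counter
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.push_back(std::thread([&counter, &m, iterations]() {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> lock(m);  // one thread at a time
                ++counter;
            }
        }));
    }
    for (std::size_t i = 0; i < threads.size(); ++i) {
        threads[i].join();                   // wait for every worker
    }
    return counter;
}
```

The order in which the threads acquire the lock varies from run to run, but because every increment happens while the lock is held, the total does not.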

If an operating system thread is killed, it will not release its locks or deallocate its resources, similarly to how killing a process does not give it a chance to flush its buffers or release operating system resources properly. Simply ending a thread when you think it ought to be finished is like pulling a ladder out from under a painter when his time is up.
Thus, we have the join member function. As in the code below, you can call join to wait for a child thread to finish. join is a polite way of telling the thread that you are going to wait until it’s done working:
myThread.join( );
The thread that calls join goes into a wait state until the thread represented by myThread is finished. If it never finishes, join never returns. join is the best way to wait for a child thread to finish. You may notice that if you put something meaningful in threadFun, but comment out the use of join, the thread doesn’t finish its work. Try this out by putting a loop or some long operation in threadFun. This is because when the operating system destroys a process, all of its child threads go with it, whether they’re done or not. Without the call to join, main doesn’t wait for its child thread: it exits, and the operating system thread is destroyed.
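The effect of join can be seen in a small sketch. It uses std::thread (C++11) in place of boost::thread, purely so that it compiles without Boost; the helper name sumOnChildThread is hypothetical. One difference worth knowing: destroying a std::thread that is still joinable calls std::terminate, so with the standard class the call to join (or detach) is mandatory.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: adds 1..n on a child thread and returns the sum
// only after the child has finished.
int sumOnChildThread(int n)
{
    assert(n >= 0);
    int total = 0;
    std::thread child([&total, n]() {
        for (int i = 1; i <= n; ++i) {
            total += i;
        }
    });
    child.join();     // block until the child thread is done
    return total;     // safe to read: the child has finished writing
}
```

Without the call to join, the function could return while the child was still writing to total, which by then would no longer exist.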

If you need to create several threads, consider grouping them with a thread_group object. A thread_group object can manage threads in a couple of ways. First, you can call add_thread with a pointer to a thread object, and that object will be added to the group. Here’s a sample:
boost::thread_group grp;
boost::thread* p = new boost::thread(threadFun);
grp.add_thread(p);
// do something…
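grp.remove_thread(p); // grp no longer owns p
p->join(); // wait for the thread to finish
delete p; // the caller must delete a removed thread object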

When grp’s destructor is called, it will delete each of the thread pointers that were added with add_thread. For this reason, you can only add pointers to heap thread objects to a thread_group. Remove a thread by calling remove_thread and passing in the thread object’s address (remove_thread finds the corresponding thread object in the group by comparing the pointer values, not by comparing the objects they point to). remove_thread will remove the pointer to that thread from the group, but you are still responsible for delete-ing it. You can also add a thread to a group without having to create it yourself by calling create_thread, which (like a thread object) takes a functor as an argument and begins executing it in a new operating system thread. For example, to spawn two threads in a group, do this:
boost::thread_group grp;
grp.create_thread(threadFun);
grp.create_thread(threadFun); // Now there are two threads in grp
grp.join_all( ); // Wait for all threads to finish
Whether you add threads to the group with create_thread or add_thread, you can call join_all to wait for all of the threads in the group to complete. Calling join_all is the same as calling join on each of the threads in the group: when all of the threads in the group have completed their work join_all returns.
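The standard library has no thread_group, but the join_all idea is easy to sketch with a std::vector of std::thread objects (C++11), swapped in here for the Boost class. The helper name squaresInParallel is hypothetical, and using one thread per element is only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: fills out[i] with i * i using one thread per
// element, then joins every thread before returning.
std::vector<int> squaresInParallel(int count)
{
    assert(count >= 0);
    std::vector<int> out(count, 0);
    std::vector<std::thread> group;
    for (int i = 0; i < count; ++i) {
        // Each thread writes only its own element, so no mutex is needed.
        group.push_back(std::thread([&out, i]() { out[i] = i * i; }));
    }
    for (std::size_t t = 0; t < group.size(); ++t) {
        group[t].join();     // the equivalent of join_all
    }
    return out;
}
```

The vector out is sized before any thread starts and never resized afterwards, which is what makes the unsynchronized writes to distinct elements safe.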
Creating a thread object allows a separate thread of execution to begin. Doing it with the Boost Threads library is deceptively easy, though, so design carefully.

#include <iostream>
#include <boost/thread/thread.hpp>
#include <boost/thread/xtime.hpp>
struct MyThreadFunc

{
void operator( )( )

{
// Do something long-running...
}
} threadFun;
int main( )

{
boost::thread myThread(threadFun); // Create a thread that starts
// running threadFun

boost::thread::yield( ); // Give up the main thread's timeslice
// so the child thread can get some work
// done.
// Go do some other work...
myThread.join( ); // The current (i.e., main) thread will wait
// for myThread to finish before it returns

}

Posted in C++, Multithreading

Tags: boost thread lib, C++, function pointer, functor, multithreaded programming in C++, Threads

Write Random Numbers to a File and Read Into Vector in C++

May 23

Posted by Khuram Ali

Graham's Code

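// Programmer: Graham Nedelka
// Output random numbers to file, read in via vector
//
#include <algorithm> // sort
#include <cstdlib> // rand, srand
#include <ctime> // time
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
int main () {
srand (static_cast<unsigned>(time(NULL)));
// Write one million random numbers in the range 1..10000, one per line
ofstream myFile("data.dat");
for (int i = 0; i < 1000000; i++)
myFile << rand() % 10000 + 1 << '\n';
myFile.close(); // flush to disk before reading the file back
vector<int> newVector;
ifstream myRead;
myRead.open("data.dat");
int x;
// Loop on the extraction itself; testing eof() before reading
// would push one extra value after the last successful read
while (myRead >> x)
{
newVector.push_back(x);
}
myRead.close();
// Print the numbers as read, then again after sorting
for (vector<int>::size_type i = 0; i < newVector.size(); i++)
cout << newVector[i] << endl;
sort(newVector.begin(), newVector.end());
for (vector<int>::size_type i = 0; i < newVector.size(); i++)
cout << newVector[i] << endl;
return 0;
}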
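One detail of reading numbers into a vector is worth isolating. Testing the extraction itself, as in while (in >> x), stops exactly when no further number can be read, whereas testing eof( ) before each read typically runs the loop body once more after the last number. The following is a minimal sketch using an in-memory stream; the helper name readAll is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: reads whitespace-separated integers until the
// stream is exhausted or a token that is not an integer is met.
std::vector<int> readAll(std::istream& in)
{
    std::vector<int> values;
    int x;
    while (in >> x) {          // the loop ends when an extraction fails
        values.push_back(x);
    }
    return values;
}
```

Because readAll takes a std::istream reference, the same function works for a std::ifstream opened on a data file and for a std::istringstream in a test.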

Posted in C++

Khuram Ali

Code is poetry.
