Shallow/Deep copy by design
Filed Under: Weekly Tuesday Dose of goodness
Hi all,
I know I’ve spoken about shallow and deep copy mechanisms in C++ before. However, I’ll do so in more detail today.
Just for your information, this topic belongs to the C++ Advanced Course Training as part of the syllabus.
We all roughly know what shallow and deep copies are. But do we know when to use each?
Let’s find out…
Introduction
Let’s take a few steps back and review the Big 4 together, shall we?
1) Default Constructor
2) Copy Constructor
3) Copy Assignment Operator
4) Destructor
For copying, strictly speaking, only the middle two are directly involved. The destructor, however, sometimes plays a part in a shallow-copy implementation too, because it has to know what to destroy.
A rule of thumb in C++ (commonly known as the Rule of Three) states that any class that manages dynamically allocated memory should implement the copy constructor, the canonical copy assignment operator and the destructor.
To sidetrack a little: canonical basically means the standard, conventional form of the copy assignment operator. In other words, you’ll need something like this:
class myClass {
public:
    // Canonical copy assignment: takes a const reference and
    // returns a non-const reference to *this.
    myClass& operator=(const myClass& copy);
};
Anything less or different, such as a different return type or a non-const parameter, results in a non-canonical variant.
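To make the canonical form a bit more concrete, here is a minimal sketch of what an implementation might look like. The class name Buffer and its members are hypothetical, made up purely for illustration. The sketch returns a plain Buffer&, which is the conventional form, and it allocates the new storage before releasing the old one so that a failed allocation leaves the object untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class that owns a heap-allocated character buffer.
class Buffer {
public:
    explicit Buffer(const char* text) : size_(0), data_(nullptr) {
        assert(text != nullptr);
        size_ = std::strlen(text) + 1;          // include the terminating '\0'
        data_ = new char[size_];
        std::memcpy(data_, text, size_);
    }

    // Copy constructor: allocates its own storage.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new char[other.size_]) {
        std::memcpy(data_, other.data_, size_);
    }

    // Canonical copy assignment: const-reference parameter,
    // returns a reference to *this so assignments can be chained.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {                   // guard against self-assignment
            char* fresh = new char[other.size_]; // allocate before releasing
            std::memcpy(fresh, other.data_, other.size_);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    const char* c_str() const { return data_; }

private:
    std::size_t size_;
    char* data_;
};
```

After b = a, both objects hold the same text but in separate storage, so destroying one does not affect the other.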
Revisiting the shallow/deep copy mechanism
Let’s revisit what is meant by shallow and deep copy.
There’s a crucial component required for the two copy mechanisms to be distinguishable: pointers. A reference may be an alias to an object, but it doesn’t have a form of its own.
Pointers are different. A pointer can be thought of as an unsigned integer: it occupies its own location in memory (stack or heap) and holds a memory address, which is typically 4 bytes on a 32-bit architecture and 8 bytes on a 64-bit one.
The address it holds works like an index into memory, very often heap memory.
Thus, a shallow copy means copying only the memory address from one pointer to another.
A deep copy, by contrast, means copying the entire contents of one object into another.
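The difference can be sketched with nothing more than raw pointers to int. The helper names below (shallow_copy, deep_copy, copies_behave_as_described) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Shallow copy: only the address is copied, so both pointers
// refer to the very same int.
inline int* shallow_copy(int* source) {
    return source;
}

// Deep copy: the pointed-to value is copied into fresh storage.
// The caller owns the returned pointer and must delete it.
inline int* deep_copy(const int* source) {
    assert(source != nullptr);
    return new int(*source);
}

// Returns true if the two kinds of copy behave as described in the text.
inline bool copies_behave_as_described() {
    int* original = new int(42);
    int* shallow = shallow_copy(original);
    int* deep = deep_copy(original);

    *original = 7;  // write through the original pointer

    const bool shared = (shallow == original) && (*shallow == 7);
    const bool independent = (deep != original) && (*deep == 42);

    delete deep;
    delete original;  // 'shallow' now dangles and must not be deleted again
    return shared && independent;
}
```

Note the last two lines of the function: with a shallow copy there is still only one object to delete, which is exactly where careless designs go wrong.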
Why is there a need for shallow copy?
Another way of asking this question would be: isn’t deep copy sufficient, and doesn’t it resolve all the sticky issues with pointers?
The answer is no, we still need shallow copy. Why? The basic answer is performance, in terms of both memory consumption and the cost of the copy operations.
A shallow copy merely copies the memory address over: 4 or 8 bytes at most. If you copy a whole object instead, with the struct aligned to 4 bytes, every attribute in the object/structure takes at least 4 bytes to copy, so the cost grows with the size of the object.
Without shallow copy, another problem might arise: if a method is not designed to handle heavy stack usage, the stack can run out very quickly because of the amount of memory needed to hold the temporary objects.
Hence the common guideline in C++: pass objects by reference (usually const reference) rather than by value.
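Here is a rough illustration of the size argument, assuming a hypothetical struct named Sample that holds 1024 ints. Both functions compute the same sum; the only difference is whether the whole struct or just its address travels to the callee.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kCount = 1024;

// Hypothetical payload, deliberately much larger than a pointer.
struct Sample {
    int values[kCount];
};

// The address is far smaller than the object it refers to.
static_assert(sizeof(Sample*) < sizeof(Sample),
              "a pointer should be smaller than the payload");

// Pass by value: the entire struct is copied into the parameter.
inline long sum_by_value(Sample s) {
    long total = 0;
    for (std::size_t i = 0; i < kCount; ++i) total += s.values[i];
    return total;
}

// Pass by const reference: no copy of the struct is made.
inline long sum_by_reference(const Sample& s) {
    long total = 0;
    for (std::size_t i = 0; i < kCount; ++i) total += s.values[i];
    return total;
}

// Both calling conventions must produce the same result.
inline void check_same_result(const Sample& s) {
    assert(sum_by_value(s) == sum_by_reference(s));
}
```

The results are identical, so for a read-only parameter of this size there is little reason to pay for the copy.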
At a very low level, references and pointers are somewhat analogous, but not exactly the same. A reference does NOT perform a shallow copy; it’s just an alias.
P.S. Try declaring a reference without initializing it and compile; your compiler will reject it.
What references and pointers have in common is that both give you indirect access to an object, that is, the ability to modify an object from outside the scope in which it was declared.
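A small sketch of that shared ability follows. The function names are hypothetical; the point is only that both a reference and a pointer let a function modify a variable that lives in its caller.

```cpp
#include <cassert>

// Modifies the caller's variable through a reference (an alias).
inline void increment_by_reference(int& value) {
    ++value;
}

// Modifies the caller's variable through a pointer (a copied address).
inline void increment_by_pointer(int* value) {
    assert(value != nullptr);
    ++*value;
}

// Returns true if the alias and the pointer both reach the same object.
inline bool reference_is_alias() {
    int counter = 0;
    int& alias = counter;     // must be initialized; 'int& alias;' does not compile
    int* pointer = &counter;  // a separate object holding counter's address

    increment_by_reference(alias);
    increment_by_pointer(pointer);

    // Taking the address of the alias yields the address of counter itself.
    return counter == 2 && &alias == &counter && pointer == &counter;
}
```

Unlike the pointer, the alias cannot later be made to refer to a different object.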
When should I use shallow copy and deep copy?
This really depends very much on the design of your application.
The notion comes from an attribute either being shared across multiple objects or having a very particular object lifespan.
Why can’t I just use static? The reason is simple. A static attribute is always one and only one per class in C++. However, you may need groups of objects that each share their own data, instead of a single controlling attribute.
This is exemplified in the design of a reference counting pointer. All instances of the reference counting pointer that refer to the same object need to point to a common reference counter as well as to the object itself.
This cannot be accomplished simply by declaring both as static attributes. If I did so, I’d only be able to keep track of one object per type, assuming that my reference counting pointer container is a template, much like cSmartPtr<T>.
Thus there’s a need to maintain separate object groups, each pointing at the same data. This is accomplished by designing shallow copy meticulously. A single mistake, however, will cause things to break apart immediately.
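To illustrate the idea, here is a minimal, single-threaded sketch of a reference counting pointer. It is not the cSmartPtr<T> mentioned above; the name CountedPtr and everything inside it are assumptions made for this example, and a production version would need more care (thread safety, exception safety in the constructor, and so on).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference counting pointer, for illustration only.
template <typename T>
class CountedPtr {
public:
    explicit CountedPtr(T* object)
        : object_(object), count_(new std::size_t(1)) {}

    // Shallow copy by design: share the object AND the counter,
    // then record that one more owner exists.
    CountedPtr(const CountedPtr& other)
        : object_(other.object_), count_(other.count_) {
        ++*count_;
    }

    CountedPtr& operator=(const CountedPtr& other) {
        if (this != &other) {
            ++*other.count_;  // join the new group first
            release();        // then leave the old group
            object_ = other.object_;
            count_ = other.count_;
        }
        return *this;
    }

    ~CountedPtr() { release(); }

    T& operator*() const {
        assert(object_ != nullptr);
        return *object_;
    }

    std::size_t use_count() const { return *count_; }

private:
    // Only the last owner in a group deletes the shared data.
    void release() {
        if (--*count_ == 0) {
            delete object_;
            delete count_;
        }
    }

    T* object_;           // shared object
    std::size_t* count_;  // shared counter, one per object group
};
```

Because the counter lives on the heap rather than in a static attribute, every group of CountedPtr instances gets its own counter, even when several groups share the same T.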
How can I control what to shallow/deep copy?
To do so, you’ll need to first understand the data flow and state of persistence. In UML, you’ll probably need your sequence and state chart diagrams to help you out.
If not, you can at least make use of your use cases to make some sense out of the data flow.
Once done, you must channel these data correctly.
The following will affect how the data in your class is manipulated:
1) Operator overloads (i.e., the canonical assignment operator and the other operators)
2) Copy Constructor
3) Default and Parameterized Constructors
4) Destructor
These are the possible locations where the data in your class can be modified or affected.
1) and 2) determine how your class is mutated or copied.
3) determines how your class is initialized.
4) determines which attributes are modified or deleted during the object’s destruction.
Remember, an object may be sharing a common pointer for certain reasons. Thus it should not necessarily delete that pointer attribute in its destructor if the design doesn’t permit it.
The same applies to the copy and initialization operations. Some things need to be brand new, some need to be shared, some need to be modified, and some need to be deleted beforehand.
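As a sketch of how these decisions might look in one class, consider a hypothetical Document that owns its text (deep copy, deleted in the destructor) but merely refers to a Settings object owned elsewhere (shallow copy, never deleted). Both names are invented for this example, and the sketch assumes the Settings object outlives every Document that points to it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical shared configuration, owned by someone else.
struct Settings {
    int font_size;
};

class Document {
public:
    Document(const char* text, Settings* shared)
        : settings_(shared),
          length_(std::strlen(text)),
          text_(new char[length_ + 1]) {
        std::memcpy(text_, text, length_ + 1);
    }

    Document(const Document& other)
        : settings_(other.settings_),           // shallow: share the pointer
          length_(other.length_),
          text_(new char[other.length_ + 1]) {  // deep: brand new storage
        std::memcpy(text_, other.text_, length_ + 1);
    }

    Document& operator=(const Document& other) {
        if (this != &other) {
            char* fresh = new char[other.length_ + 1];
            std::memcpy(fresh, other.text_, other.length_ + 1);
            delete[] text_;                     // old text is deleted prior
            text_ = fresh;
            length_ = other.length_;
            settings_ = other.settings_;        // still shared, not copied
        }
        return *this;
    }

    // Deletes only what this object owns; settings_ is left alone.
    ~Document() { delete[] text_; }

    const char* text() const { return text_; }

    Settings& settings() const {
        assert(settings_ != nullptr);
        return *settings_;
    }

private:
    Settings* settings_;   // shared, not owned
    std::size_t length_;
    char* text_;           // owned
};
```

Changing the font size through one Document is visible through every copy, while each copy keeps its own text.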
Conclusion
There are no extremes in shallow and deep copy. That is, even with the Big 4 (default constructor, copy constructor, copy assignment operator, destructor), it doesn’t necessarily mean that everything must be done the deep copy way.
The whole point of allowing these operations to be overridden is to let a developer clearly indicate what is to be copied by shallow or deep means.
This indication of course derives from an understanding of the technical requirements, which are in turn derived from the project requirements.
Design is key to how shallow and deep copy should be handled.
Hopefully, this will clear up any possible misunderstandings due to my previous posts on shallow and deep copy.
Have a great week ahead!
Signing off,
Jeremy
- 27 Apr 2010 10:41 AM