
Opinion Wanted - How to Expose a Collection

On Sun, 24 May 2009 12:23:31 -0700 (PDT), "jeh...@gmail.com" <...@gmail.com> wrote:

Hello:

The "gurus" out there suggest being very careful about how you
expose collections in your interface. If you do not intend the users
of your class to alter a collection, you must make sure that they
can't.

Here are some common implementations for returning a collection and
some of their pros and cons. Take a look and tell me what you
typically do and which one you think is the best practice.

1) Return the collection directly - Directly expose the collection in
the interface. If the collection is created in the property or method,
altering the collection won't likely affect the class. If the
collection is a member of the class, changes to it could invalidate
the state of the class. Future versions of the interface will be
required to expose the functionality provided by the collection (to be
backward compatible).

2) Return a copy of the collection - Expose a copy of the collection.
The copy would provide all of the same functionality as the
collection, but changes to it couldn't affect the state of the class.
This could lead to a lot of overhead if the collections are large or
complex. Your interface will be forced to expose the collection in
future versions.

3) Return a read-only wrapper around the collection - Expose a read-
only collection. This prevents modification. It has minimal overhead.
However, modifications to the collection may cause runtime errors.
You're still exposing it in your interface.

4) Return the collection through an interface - Expose the collection,
but through one of its interfaces. This would be like exposing List<T>
as IEnumerable<T>. [...] you expose List<T> [...]

5) Implement a custom collection - Expose the collection, but through
a custom collection with a constrained interface. This can eliminate
generics. The collection could inherit Collection<T> [...]

This is under the assumption that I actually want to expose a
collection. I'm not too concerned with the logistics behind
determining when to expose a collection, I just want to know how.
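
A minimal C# sketch of options 1 through 4 side by side may make the trade-offs easier to compare. The `Project` class and its property names are hypothetical and not from the thread; only `List<T>`, `ReadOnlyCollection<T>`, and `IEnumerable<T>` are real framework types.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical class, used only to illustrate options 1-4.
public class Project
{
    private readonly List<string> tasks = new List<string> { "design", "build" };

    // 1) Direct exposure: callers can Add/Remove and so change this object's state.
    public List<string> TasksDirect { get { return tasks; } }

    // 2) Copy: callers get their own list; costs O(n) on every call.
    public List<string> TasksCopy { get { return new List<string>(tasks); } }

    // 3) Read-only wrapper: cheap to create; mutation attempts fail at run time.
    public ReadOnlyCollection<string> TasksReadOnly { get { return tasks.AsReadOnly(); } }

    // 4) Restricted interface: no Add at compile time, but the object is still
    //    the underlying List<string>, so a cast would succeed.
    public IEnumerable<string> TasksEnumerable { get { return tasks; } }
}
```

Note that the wrapper in option 3 is a live view: items added to the underlying list later show up through it.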

My approach, which has been evolving for a while now, has been to
return the collection through a restricted interface. I just have a
policy that says I can't cast from IEnumerable<T> [...]. If I want to
work with a List<T>, [...] members via the ctor or the AddRange method.

I realize that this has costs associated with it. Generally, it has
not been an issue.
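
The copy-instead-of-cast policy can be sketched as follows. The helper class and method names are hypothetical; the `List<T>` constructor that takes an `IEnumerable<T>` and `List<T>.AddRange` are the two framework members the post mentions.

```csharp
using System.Collections.Generic;

// Hypothetical helpers illustrating "copy, never cast".
public static class CopyInsteadOfCast
{
    // Copy via the List<T> constructor: the source is only enumerated.
    public static List<T> ToNewList<T>(IEnumerable<T> source)
    {
        return new List<T>(source);
    }

    // Copy via AddRange into a list the caller already owns.
    public static void AppendTo<T>(List<T> target, IEnumerable<T> source)
    {
        target.AddRange(source);
    }
}
```

The cost is one O(n) copy per call, which is the overhead the post refers to.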

There are times where I won't even follow this policy. Sometimes I
don't think exposing a List<T> [...] collection is part of the class's
state, I will create a copy to
mitigate a cast being performed. It kind of bothers me that I switch
from time to time.

I'm more curious what strategies others have come to practice on a
day-to-day basis. I'd like to hear as a community what seems to be the
trend.



On Sun, 24 May 2009 14:06:43 -0700, "Peter Duniho" <...@nnowslpianmk.com> wrote:

On Sun, 24 May 2009 12:23:31 -0700, jehu...@gmail.com
<...@gmail.com> wrote:

Seems to me that the above statement says it all. It's all about the
requirements. If you have a requirement that the users of your class not
alter a collection, either by modifying the collection instance or by
providing a different instance, then you need to enforce that somehow.
But that's not always the requirement.

There is no "best practice". It depends entirely on the context.

This approach (not counting the unrelated comment above regarding "if the
collection is created in the property or method"...clearly that's an
entirely different situation and doesn't really belong in this paragraph)
doesn't meet a requirement that states the collection cannot be changed.

So obviously you'd only use it in a scenario where it makes sense for the
collection to be modified by the client code.

Note that there are two variations on the theme: a read-only property,
where the collection itself may be modified, but where the client cannot
replace the collection instance itself; and a read/write property where
the collection as well as the specific instance used may be modified.

You'll find various examples of each in .NET, all used in situations where
the requirement isn't that the client may not change the collection.

This overhead is required if you need "absolute" safety. The only way to
guarantee that the client code can't modify your collection is to not let
it see it in the first place. Note that even in that case, nothing
precludes the client code from using reflection to discover your
collection and violate whatever contract you've stipulated. But at least
if they do that, you know they didn't get it from the instance you
returned from the property. :)

This is my preference when I want to return a read-only collection
implementing the IList<T> [...] of ReadOnlyCollection<T> [...] misuse are
delayed until run-time, but when you're dealing with the
built-in .NET interfaces, you don't have much choice about that. IList<T>
offers indexed access to the collection, which is sometimes what you need,
but there's no read-only version (i.e. one with a read-only indexer).

Even if you added a custom read-only IReadOnlyList<T> [...]
retroactively make existing .NET classes implement it. Wrapping the class
is the only available option. Of course, you can always wrap an existing
class in a custom read-only interface implementation, but then you run
into potential design questions if you then need to pass that
implementation back to external code somewhere (e.g. sure,
IReadOnlyList<T> [...] an attempt to mutate the instance occurred, but
then how's that any better than just using ReadOnlyCollection<T>?).
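
The "delayed until run-time" point can be seen in a short sketch. The class and method names are hypothetical; the behaviour shown (an `Add` through `IList<T>` on a `ReadOnlyCollection<T>` compiles and then throws `NotSupportedException`) is standard framework behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ReadOnlyWrapperDemo
{
    // Returns true when an Add through IList<T> is rejected at run time.
    public static bool AddIsRejected()
    {
        List<int> inner = new List<int> { 1, 2, 3 };
        IList<int> view = new ReadOnlyCollection<int>(inner);

        try
        {
            view.Add(4);      // compiles, because IList<T> declares Add
            return false;
        }
        catch (NotSupportedException)
        {
            return true;      // the misuse only surfaces here, at run time
        }
    }
}
```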

I do this often, yes. I try to treat my types, including collections, as
the lowest-common-denominator needed. And yes, this provides basic
compile-time protections, but doesn't ensure against clients casting the
type. Note, however, that casting the type is about the same design-wise
as using reflection to get at the implementation details of any class.
It's more efficient, but otherwise violates all the same encapsulation
rules. Clients that do so, do so at their own risk.

Note also that violating that encapsulation rule isn't always bad. For
example, the Enumerable.Count() extension method attempts to cast the
target object to ICollection, and gets the ICollection.Count property
instead of enumerating the entire collection when possible. One could
argue that's a violation of the encapsulation, but personally I'm glad
that performance optimization is there.
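
The kind of "cast first, enumerate only if needed" optimization described here might look roughly like the sketch below. This is an illustration of the idea, not the framework's actual source; `FastCount` is a hypothetical name.

```csharp
using System.Collections.Generic;

public static class CountSketch
{
    // Try the cheap path first, then fall back to walking the sequence.
    public static int FastCount<T>(IEnumerable<T> source)
    {
        ICollection<T> collection = source as ICollection<T>;
        if (collection != null)
        {
            return collection.Count;   // O(1): no enumeration needed
        }

        int count = 0;
        foreach (T item in source)
        {
            count++;                   // O(n) fallback for plain sequences
        }
        return count;
    }
}
```

The cast here only reads a property, which is why it is easier to defend than a cast used to mutate the collection.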

I would implement a custom interface in this scenario. You don't want to
inherit Collection<T> [...] kind of inheritance in mind (e.g. the indexer
for List<T> [...] virtual...you can't make a List<T> [...]

After all, we've already got ReadOnlyCollection<T> [...] run-time
protection. For compile-time protection, an interface
accomplishes the same thing, but without requiring a specific inheritance
hierarchy.

Obviously a custom interface does require a custom implementation. So,
for example, you could define an IReadOnlyList<T> [...]
read-only, and then implement the interface in your own collection types.
But, I wouldn't bother with thinking of your custom implementation as
anything other than a read-only wrapper with compile-time information.

Of course, as long as you're happy to pass around a reference actually
_typed_ as ReadOnlyCollection<T> [...]
protection you'd get with an interface, without the bother of actually
defining one. But obviously there may be cases where you'd prefer to deal
with a custom interface instead.

If anything, maybe this option #5 is best implemented by sub-classing
ReadOnlyCollection<T> [...] that sub-class, implementing your custom
IReadOnlyList<T> [...]
I would only bother with this if you anticipate other implementations of
IReadOnlyList<T> [...] ReadOnlyCollection<T> [...]
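
A sketch of the sub-classing idea, under stated assumptions: the interface below is hypothetical and deliberately named `IReadOnlyListSketch<T>`, because later .NET versions define their own `IReadOnlyList<T>` and the names would clash. Its members are an assumption about what such an interface would contain.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical custom interface with a read-only indexer.
public interface IReadOnlyListSketch<T> : IEnumerable<T>
{
    int Count { get; }
    T this[int index] { get; }   // no setter, so misuse fails at compile time
}

// ReadOnlyCollection<T> already has a public Count and a get-only indexer,
// so the sub-class satisfies the interface without any extra members.
public class ReadOnlyListSketch<T> : ReadOnlyCollection<T>, IReadOnlyListSketch<T>
{
    public ReadOnlyListSketch(IList<T> list) : base(list) { }
}
```

Client code typed against `IReadOnlyListSketch<T>` gets compile-time protection, while the inherited wrapper still provides the run-time protection if someone casts to `IList<T>`.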

Nor should it. If it is an issue, you've got bigger fish to fry. :)

It's a bad idea to down-cast like that, and the people writing the code
should know that even without an explicit policy. Making the policy
explicit should make things even better. (Not that down-casting is always
wrong...sometimes it's unavoidable. Just that it needs to be used very
carefully, and in a situation where it's clearly documented to be safe and
within the contract of the code involved).

Note that your hypothetical situation "If I want to work with a List<T> [...]"
sort of puts the cart before the horse. That is, sure...creating a new
List<T> [...] preserves the safety of the original collection. But, it begs the
question as to why it's appropriate to take the data you got from the
IEnumerable<T> [...] might have to do that, often that suggests that you picked the wrong
lowest-common-denominator in the first place, or that you're handling the
data incorrectly.

Hard to say without specific examples, but any time you find yourself
working around some design contract, the first question should be "is this
really the right thing to do, rather than either following the design
contract, or modifying the design contract to suit the need better?"

It should. Not that being inconsistent is bad. It's that you should at
least be concerned when you're inconsistent, and take some time to
understand why you're doing it. It should never be out of simple
convenience. You should have some clear, direct goal that is achieved
only through the inconsistency and which isn't in conflict with your other
goals.

Pete

On Sun, 24 May 2009 21:46:19 -0700 (PDT), "jeh...@gmail.com" <...@gmail.com> wrote:

On May 24, 3:06 pm, "Peter Duniho" <...@nnowslpianmk.com> wrote:

The only reason I want a List<T> [...]
The only reason I want to sort the data is because my data layer
didn't do it for me. The only reason my data layer didn't do it was
because I reuse the same SQL without ORDER BYs because I might want to
sort differently. Perhaps I should overload my data layer methods with
an IComparer<T> [...] to pass sorting criteria from the top of the application down to the
data layer...

But it's like I said, it really hasn't been an issue. And it's like
you said, it really depends on the context.

Which brings up another question... how do you sort by multiple
properties within an object at the same time? Say you want to sort
people by their last names and then their age. This is how I have been
doing it. It works well up until you start sorting by a large
number of properties.

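class Person
{
    public string LastName { get; set; }
    public int Age { get; set; }
}

// Compare by last name first, then by age to break ties.
int Compare(Person p1, Person p2)
{
    int result = Comparer<string>.Default.Compare(p1.LastName, p2.LastName);
    if (result == 0)
    {
        result = Comparer<int>.Default.Compare(p1.Age, p2.Age);
    }
    return result;
}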

After about the second property, the if (result == 0) gets a little
mind-numbing. When all of the properties have the same type, I just
stick their values in arrays and loop through instead. That doesn't
seem that great either.
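
One way to avoid the repeated `if (result == 0)` without giving up compile-time safety is to chain `Comparison<T>` delegates. This is a sketch, not something proposed in the thread; `ComparisonChain.Combine` is a hypothetical helper, and `Person` is redeclared here only so the block stands alone.

```csharp
using System;
using System.Collections.Generic;

public static class ComparisonChain
{
    // Combines several comparisons; the first non-zero result decides the order.
    public static Comparison<T> Combine<T>(params Comparison<T>[] comparisons)
    {
        return delegate(T left, T right)
        {
            foreach (Comparison<T> comparison in comparisons)
            {
                int result = comparison(left, right);
                if (result != 0) return result;
            }
            return 0;   // equal on every key
        };
    }
}

public class Person
{
    public string LastName { get; set; }
    public int Age { get; set; }
}
```

It could be used as `people.Sort(ComparisonChain.Combine<Person>((a, b) => string.CompareOrdinal(a.LastName, b.LastName), (a, b) => a.Age.CompareTo(b.Age)));`. Where LINQ is available, `people.OrderBy(p => p.LastName).ThenBy(p => p.Age)` expresses a similar ordering, though it returns a new sequence instead of sorting in place.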

I guess you could rewrite a generic IComparer that used reflection.
I'm still debating whether I'm willing to sacrifice the performance or
the compile time safety... my properties change names now and again and
they are reused in several subsystems...

On Sun, 24 May 2009 22:40:24 -0700, "Peter Duniho" <...@nnowslpianmk.com> wrote:

On Sun, 24 May 2009 21:46:19 -0700, jehu...@gmail.com
<...@gmail.com> wrote:

I think for comparisons that are not dynamically generated, simply
hard-coding the comparison logic as you demonstrated is fine. Yes, with a
lot of properties it gets redundant, but it's efficient, readable, and
works. :)

For dynamically generated sorting (e.g. based on user selections),
reflection can be okay. Just make sure you cache the PropertyInfo objects
before the sort so you're not performing the same slow reflection
operations over and over.
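
A sketch of the caching advice, assuming the sort keys arrive as property names (for example from a user's column selection). `ReflectionSort.ByProperties` is a hypothetical helper; `Type.GetProperty`, `PropertyInfo.GetValue`, and the non-generic `Comparer.Default` are real framework members.

```csharp
using System;
using System.Reflection;

public static class ReflectionSort
{
    // Builds a comparison over the named public properties, in priority order.
    public static Comparison<T> ByProperties<T>(params string[] propertyNames)
    {
        // Look the properties up once, before any sorting starts.
        PropertyInfo[] properties = new PropertyInfo[propertyNames.Length];
        for (int i = 0; i < propertyNames.Length; i++)
        {
            properties[i] = typeof(T).GetProperty(propertyNames[i]);
            if (properties[i] == null)
            {
                throw new ArgumentException("No such property: " + propertyNames[i]);
            }
        }

        return delegate(T left, T right)
        {
            foreach (PropertyInfo property in properties)
            {
                object a = property.GetValue(left, null);
                object b = property.GetValue(right, null);
                // The non-generic comparer relies on the values being IComparable.
                int result = System.Collections.Comparer.Default.Compare(a, b);
                if (result != 0) return result;
            }
            return 0;
        };
    }
}
```

A misspelled property name fails when the comparison is built, not in the middle of the sort, but it is still a run-time failure, which is the compile-time safety being traded away.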

An alternative approach is to change your property implementation to use
an indexer rather than named properties. Or if you want to retain the
convenience of named properties, use the indexer in addition to (depending
on what's more important, you can have the indexer go through the named
properties, based on a switch -- more efficient, but a little more trouble
to maintain -- or pre-generated reflection-based lookup, or you can have
each property go through the indexer, casting for the convenience of the
consumer of the property).
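
The first variant (an indexer that goes through the named properties based on a switch) might be sketched like this. The choice of `IComparable` as the indexer's return type is an assumption made so the values can be compared directly; the thread does not specify one.

```csharp
using System;

public class Person
{
    public string LastName { get; set; }
    public int Age { get; set; }

    // Indexer that routes a property name to the matching named property.
    public IComparable this[string propertyName]
    {
        get
        {
            switch (propertyName)
            {
                case "LastName": return LastName;
                case "Age": return Age;   // boxed to IComparable
                default: throw new ArgumentException("Unknown property: " + propertyName);
            }
        }
    }
}
```

A dynamically chosen key then becomes `people.Sort((a, b) => a[key].CompareTo(b[key]))`, assuming the property values are not null. The switch has to be kept in step with the properties by hand, which is the maintenance cost mentioned above.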

I wouldn't bother with all that trouble though unless I needed it for
dynamically generated sort criteria.

Pete

On Sun, 24 May 2009 23:40:08 +0200, Ertugrul Söylemez <...@ertes.de> wrote:

"jeh...@gmail.com" <...@gmail.com> wrote:

6) Pass each element of the collection to a closure. This appears to be
the most elegant method to me and in many cases also the fastest,
especially for dynamic collections.

Greets,
Ertugrul.

--
nightmare = unsafePerformIO (getWrongWife [...]
http://blog.ertes.de/

On Sun, 24 May 2009 21:26:42 -0700 (PDT), "jeh...@gmail.com" <...@gmail.com> wrote:

On May 24, 3:40 pm, Ertugrul Söylemez <...@ertes.de> wrote:

Expand. What is a closure?

On Sun, 24 May 2009 22:45:18 -0700, "Peter Duniho" <...@nnowslpianmk.com> wrote:

On Sun, 24 May 2009 21:26:42 -0700, jehu...@gmail.com
<...@gmail.com> wrote:

A "closure" is a specific term that in this case describes anonymous
methods in C#. I'm not sure exactly what Ertugrul is describing
specifically, but he might be talking about using an anonymous method as a
delegate for some kind of enumeration method on the collection.

Of course, to some extent that's how one might use IEnumerable<T> [...]
know). But it's true that if you encapsulate the enumeration in the class
containing the collection, allowing client code to pass the delegate to
that class rather than exposing an actual IEnumerable<T> [...]
lot of collection enumeration scenarios without ever exposing the actual
collection.
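
A sketch of encapsulating the enumeration in the owning class. `TaskBoard` and `ForEachTask` are hypothetical names; the point is only that the client passes a delegate and never receives the list or an `IEnumerable<T>`.

```csharp
using System;
using System.Collections.Generic;

public class TaskBoard
{
    private readonly List<string> tasks = new List<string>();

    public void Add(string task) { tasks.Add(task); }

    // Client code supplies the per-element action; the list never leaves the class.
    public void ForEachTask(Action<string> visit)
    {
        foreach (string task in tasks)
        {
            visit(task);
        }
    }
}
```

One caveat: if the delegate calls `Add` on the same board while the loop is running, the `foreach` throws `InvalidOperationException`, so the class still has to decide how to handle re-entrant changes.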

Pete

On Mon, 25 May 2009 13:13:51 +0200, Ertugrul Söylemez <...@ertes.de> wrote:

jehu...@gmail.com wrote:

Sorry, since I'm mostly using non-MS languages, I use the more common
term 'closures' for what Microsoft prefers to call "delegates". The
idea is based on the fact that lists don't need to be represented or
even representable in memory. Here is code for a very simple and
comprehensible special case of the idea, which shows how this can save
you a lot of memory and computation time:

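// Expose the list [1, 2, 3].
void exposeOneTwoThree(Action<int> expose) {
    expose(1);
    expose(2);
    expose(3);
}

// Expose all real square roots.
void exposeSqrts(double x, Action<double> expose) {
    double sqrt;

    if (x < 0) return;
    sqrt = Math.Sqrt(x);
    expose(sqrt);
    if (sqrt != 0) expose(-sqrt);
}

// somewhere else (the lambda bodies are missing from this copy of the
// post; printing each element is assumed here):
exposeOneTwoThree(x => Console.WriteLine(x));
exposeSqrts(16, x => Console.WriteLine(x));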

The general idea comes from lambda calculus and states that a list can
be identified by its accompanying folding function. The great advantage
here is that you can do things to the list without ever constructing it
in memory like calculating its sum, finding its maximum or printing its
elements or even doing a mixture of these operations:

// Fold the list [1, 2, 3]
static Y oneTwoThree<Y>(Func<Y, int, Y> fold, Y baseVal) {
    return fold(fold(fold(baseVal, 1), 2), 3);
}

// Alternative notation to make the inner working more clear:
static Y oneTwoThreeAlt<Y>(Func<Y, int, Y> fold, Y baseVal) {
    Y current = baseVal;

    current = fold(current, 1);
    current = fold(current, 2);
    current = fold(current, 3);
    return current;
}

// Fold all real square roots.
static Y sqrts<Y>(double x, Func<Y, double, Y> fold, Y baseVal) {
    Y current = baseVal;
    double sqrt;

    if (x < 0) return current;
    sqrt = Math.Sqrt(x);
    current = fold(current, sqrt);
    if (sqrt != 0) current = fold(current, -sqrt);
    return current;
}

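// (Lambda bodies that are missing from this copy of the post are
// reconstructed from the example comments.)

// Example: print each element (using bool for Y):
oneTwoThree((_, y) => { Console.WriteLine("{0}", y); return true; }, true);

// Example: return the sum of the elements of the exposed list:
int sum = oneTwoThree((x,y) => x + y, 0);

// Example: print each accumulation step of the product and return it:
int product = oneTwoThree((x,y) => {
    int p = x*y;
    Console.WriteLine("{0}", p);
    return p;
}, 1);

// Example: get a real List<double>:
List<double> list = sqrts(16, (x,y) => { x.Add(y); return x; }, new List<double>());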

Note 1: I have not tested the code, but it should work.
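
For comparison, the same fold can be written with what C# already ships: an iterator method produces the elements lazily and `Enumerable.Aggregate` does the folding. This is a sketch of an equivalent, not part of the original post; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SqrtsViaAggregate
{
    // Lazily yields the real square roots of x; no list is ever built.
    public static IEnumerable<double> Sqrts(double x)
    {
        if (x < 0) yield break;
        double sqrt = Math.Sqrt(x);
        yield return sqrt;
        if (sqrt != 0) yield return -sqrt;
    }

    // Fold with Aggregate: the seed comes first, then the folding function.
    public static double SumOfSqrts(double x)
    {
        return Sqrts(x).Aggregate(0.0, (sum, root) => sum + root);
    }
}
```

The difference from the closure version is that the caller receives an `IEnumerable<double>`, so this variant does expose a sequence type, only not a stored collection.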

Greets,
Ertugrul.

On Wed, 27 May 2009 11:32:21 +0100, "Paul" <...@novareconsulting.com> wrote:

Hi again mate,

Sorry, do not take this the wrong way.

I know you have a lot of stuff to do and other complexities to solve. What
will doing this now actually achieve for you, and will it add value to your
application at this stage? I only ask because the statement 'PICK YOUR
BATTLES' comes to mind.


On Wed, 27 May 2009 15:35:44 -0700 (PDT), "jeh...@gmail.com" <...@gmail.com> wrote:

On May 27, 4:32 am, "Paul" <...@novareconsulting.com> wrote:

I don't sit on my hands waiting for a lot of my questions to be
answered. I just like to see what other people are doing. Who knows, I
might be missing something obvious. They usually come after I see some
code that gives me a bad feeling and I don't know why.

This email chain helped me make a decision. I was creating List<T> [...]
the data layer of my application and returning it as an
IEnumerable<T> [...] by creating a new List<T> [...] data layer routines
to take IComparer<T> [...] place there. This email chain convinced me
that casting was a no go
and that recreating a list in the business layer wasn't a big deal. I
was debating whether returning List<T> would be such a bad thing, too.
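
A sketch of a data-layer routine that takes an `IComparer<T>` and returns `IEnumerable<T>`, as far as the surviving text describes it. `PersonData`, `GetPeople`, and the in-memory rows are hypothetical stand-ins for the real data access code, which the thread does not show.

```csharp
using System.Collections.Generic;

public class Person
{
    public string LastName { get; set; }
    public int Age { get; set; }
}

// Hypothetical data-layer class; the hard-coded rows stand in for a query
// that runs without ORDER BY.
public class PersonData
{
    public IEnumerable<Person> GetPeople(IComparer<Person> order)
    {
        List<Person> people = new List<Person>
        {
            new Person { LastName = "Smith", Age = 40 },
            new Person { LastName = "Jones", Age = 30 }
        };

        if (order != null)
        {
            people.Sort(order);   // sorting happens here, before the list is handed out
        }
        return people;            // callers see only IEnumerable<Person>
    }
}
```

A caller that still needs a list makes its own with `new List<Person>(data.GetPeople(order))` and never casts the returned sequence.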