
Tuesday, September 20, 2022

[FIXED] What type of Map should I use if the data does not change?

September 20, 2022 · hashmap, java

Issue

I have data that I want to look up by key.

My particular use case is that the data (key/value pairs and number of elements) does not change once the map is initialised. All key/value pairs are known up front.

I have generally used a HashMap for this, created with the default constructor (default initial capacity and load factor).

What is the best way to build this Map? If I were to use HashMap, what should the initial capacity and load factor be set to? Is Map.copyOf() a better solution? Does the size of the map matter (20 elements vs 140,000)?

This article https://docs.oracle.com/en/java/javase/15/core/creating-immutable-lists-sets-and-maps.html#GUID-6A9BAE41-A1AD-4AA1-AF1A-A8FC99A14199 seems to imply that the immutable Map returned by Map.copyOf() is more space-efficient.


Solution

HashMap is fairly close to optimal in most cases already. The array of buckets doubles in capacity each time it grows, so it's most wasteful when you have (2^N) + 1 items, since the capacity will necessarily be 2^(N+1) (i.e. at a load factor of 1, 2049 items require a capacity of 4096, but 2048 items fit perfectly).
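
For illustration only, the sizing arithmetic described above can be sketched like this (tableCapacityFor is a hypothetical helper, not part of the HashMap API):

// Smallest power-of-two table size a HashMap would need to hold `items`
// entries at the given load factor (resizing triggers once size exceeds
// capacity * loadFactor). Illustrative helper, not a real HashMap method.
static int tableCapacityFor(int items, float loadFactor) {
    int needed = (int) Math.ceil(items / loadFactor);
    int capacity = 1;
    while (capacity < needed) {
        capacity <<= 1; // the bucket array always doubles
    }
    return capacity;
}

// tableCapacityFor(2048, 1.0f) == 2048
// tableCapacityFor(2049, 1.0f) == 4096  (almost half the table sits empty)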

In your case, specifying an initial size will only prevent a few reallocations when the map is created, which, if it only happens once, probably isn't relevant. Load factor is not relevant because the map's capacity will never change once it is populated. In any case, if you did want to pre-size, this would be correct:

new HashMap<>(numItems, 1.0f); // initial capacity = numItems, load factor = 1
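
For instance, building the map once up front might look like the following sketch, where knownData is a placeholder for however the key/value pairs arrive (String keys and Integer values are just for illustration):

// Pre-sized so the table never resizes while being populated.
Map<String, Integer> lookup = new HashMap<>(knownData.size(), 1.0f);
lookup.putAll(knownData);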

Does the size of the map matter (20 elements vs 140,000)?

It will have an impact, but not a massive one. Items are grouped into buckets, and buckets are structured as lists or trees. So the performance is mostly dependent on how many items are in a given bucket, rather than the total number of items across all buckets.

What's important is how evenly the items are distributed across your buckets. A bad hash code implementation will result in clustering. Clustering will start to move O(1) operations towards O(log n), I believe.

// The worst possible hashCode impl.
@Override
public int hashCode() { return 0; } // or any other constant 

If you have the same items in the map across multiple invocations of your application (not clear from the question if that's the case), and if the class of the key is under your control, then you have the luxury of being able to tweak the hashCode implementation to positively affect the distribution, e.g. by using different prime numbers as a modulus. This would be trial and error, though, and is really only a micro-optimization.
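
As a sketch, a hand-rolled hashCode for a simple two-field key might look like the class below; the field names and the conventional prime 31 are illustrative, and swapping in other primes is exactly the trial-and-error tuning mentioned above:

// Illustrative key class; fields and constants are placeholders.
final class Key {
    private final String name;
    private final int id;

    Key(String name, int id) {
        this.name = name;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Key)) return false;
        Key other = (Key) o;
        return id == other.id && name.equals(other.name);
    }

    @Override
    public int hashCode() {
        // 31 is the conventional prime multiplier; varying it changes
        // how keys spread across buckets.
        return 31 * name.hashCode() + id;
    }
}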

As for the comments/answers addressing how to confer immutability, I'd argue that that's a separate concern. First work out what map is actually optimal, then worry about how to confer immutability upon it, if it isn't already. You can always wrap a mutable map in Collections.unmodifiableMap. Supposedly Guava's ImmutableMap is slower than HashMap, and I suspect other immutable variants will struggle to exceed the performance of HashMap too.
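
As a sketch, either of the following turns a finished HashMap (the lookup variable from the earlier example) into something callers cannot modify; Collections.unmodifiableMap wraps the original without copying, while Map.copyOf produces a separate, compact immutable copy:

// Both assume `lookup` is the HashMap built earlier.
Map<String, Integer> readOnlyView  = Collections.unmodifiableMap(lookup); // wraps, no copy
Map<String, Integer> immutableCopy = Map.copyOf(lookup);                  // compact immutable copy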



Answered By - Michael
Answer Checked By - Timothy Miller (PHPFixing Admin)