[97274 ms] hm{"pageCount"=null, "status_code"="200", "text"="\n			\n			\n		\n\n				\n			\n		\n\n	\n			\n			\n		\n\n	\n\n\n		\n\n\n	\n	\n\n	\n		   \n	   \n	    \n	  \n	    		Subscribe\n	Newsletters\n	Digital Library\n	RSS\n\n	\n							\n		\n\n    	\n		\n	\n		\n\n\n	\n		\n	\n\n\n	Search: \n	 Site\n	 Source Code\n\n	\n\n	\n\n		\n 	\n\n\n\n	Home\n	Articles\n	News\n	Blogs\n	Source Code\n	Dobb's TV\n	Webinars & Events\n\n\n\n\n\n\n\n\n  	 	\n  		\n	\n\n\n  \n		\n\n\n\n \n \n \n\n	\n	        Sections ▼\n					Home\n	Articles\n	News\n	Blogs\n	Source Code\n	Dobb's TV\n	Webinars & Events\n\n	\n\n  \n  \n  \n\n\n\n\n		\n\n\n\n\n\n\n\n	\n	\n	\n			Cloud\n	Mobile\n	Parallel\n	.NET\n	JVM Languages\n	C/C++\n	Tools\n	Design\n	Testing\n	Web Dev\n	Jolt Awards\n\n\n\n\n	  \n\n	\n	\n	        Channels ▼\n					Cloud\n	Mobile\n	Parallel\n	.NET\n	JVM Languages\n	C/C++\n	Tools\n	Design\n	Testing\n	Web Dev\n	Jolt Awards\n\n	\n\n  	\n	\n	\n					\n			\n		\n\n		\n\n	\n			\n		\n    \n    	  	\n			    	\n    	        \n	    	\n        \n            Walter Bright\n            \n            Dr. Dobb's Bloggers\n\n        \n\n    \n\n    			\n    \n     \n        \n            \n                \n                	                    Bio |  Archive \n                \n\n                Walter Bright\n\n            \n\n             \n                \n        	            \n\n\n\n        					  \n            \n\n        \n\n    \n\n\n\n		\n				\n			\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n	 \n	 \n				\n		    Tweet\n	\n        \n						 \n        					 \n		\n				    	 \n			\n		\n        \n			\n		\n\n		         \n        		\n		\n		\n		\n		\n\n		\n		\n		\n		\n		\n\n				\n				\n		\n						 \n						\n						\n			 \n			\n			Permalink\n			\n\n		\n\n		\n		\n				\n	\n\n	\n			\n\n\n\n	\n\n\n\n	\n		Increasing Compiler Speed by Over 75%\n\n		 July 25, 2013\n\n	\n\n\n\n\nD is designed to be a language that is amenable to fast compilation.\n\n    \n\n		    \n		\n	\n	\n\nI began implementing compilers on painfully slow machines like the IBM PC, so having the compiler run fast was always a big deal. It got a lot of attention, and the compilers I wrote were known for being very fast. I naturally built the techniques learned along the way into the DMD D compiler, and D itself is designed to be a language that is amenable to fast compilation.\n\n\n\n\n\n	\nMore Insights\n\n\n\nWhite Papers\n		 \n       \n			\n					\n				Securosis Analyst Report: Security and Privacy on the Encrypted Network				\n								\n			\n					\n				Coding to standards and quality: supply-chain application development				\n								\n\n\nMore >>Reports\n		 \n       \n			\n					\n				Return of the Silos				\n								\n			\n					\n				SaaS and E-Discovery: Navigating Complex Waters				\n								\n\n\nMore >>Webcasts\n		 \n       \n			\n					\n				Asset Management For Electronics Industry				\n								\n			\n					\n				IT and LOB Win When Your Business Adopts Flexible Social Cloud Collaboration Tools				\n								\n\n\nMore >>\n\n\n\nThis has been a big success for D, and many engineers have told me that D's compile speed was a huge factor in being seduced by D. Every big boost in compile speed has a transformative effect on the development process.\n\n\n\nIt's easy to get complacent and focus on other issues with DMD. Subtle  degradations in speed can creep in, especially with a lot of people contributing to the codebase. Compiling the D test suite was taking too much time — making that faster has a lot of leverage in improving productivity. In particular, some D code makes very heavy use of templates, and these were not compiling as fast as I'd like. The library module std.algorithm stood out.\n\n\n\nThe obvious low hanging fruit was that some templates were being instantiated 70,000 times or more. This is a problem well known among C++ compiler guys — given a template and the arguments to instantiate it, look to see if it's already been instantiated and simply point to the existing one. The lookup for this was, I'm ashamed to admit, a linear list. It's like dragging a block of concrete behind your sports car. Replacing this with a hash table netted a mere 10% improvement, I expected more. How disappointing.\n\n\n\nMy next idea revolved around the issue that templates, when instantiated, are given a string identifier that uniquely identifies them. You can see these names if you disassemble any template-using code written in C++ or D. They tend to be long, as they incorporate a \"mangled\" version of each and every template argument.\n\n\n\nThe thing is, an awful lot of templates are generated that never appear in the  generated code, as they are templates that produce a value or a type. These don't need generated identifiers. I switched the identifier generation to being lazy; i.e., only generated if needed by the object code generation. This also produced a perceptible improvement, but much more modest than I'd anticipated.\n\n\n\nTime to stop guessing where the speed problems were, and start instrumenting. Time to trot out gprof, the Gnu profiler. I fired it up on the slow example, and waited. And waited, and waited, and waited. I waited overnight. It pole-axed my Ubuntu box, I had to cold boot it. The test case was so big that it plus gprof was too much. gprof slows things down a lot, so I had to cut things way down to get it to work.\n\n\n\nThe top three time-eating tasks were:\n\n\n\n\n	lexing\n\n	hash table lookups\n\n	storage allocation\n\n\n\n\n\nLexing has always been the succubus leaching the cycles out of a compiler. It's written with that in mind (and so is D's grammar), and I didn't see much more oil to be pressed out of that olive.\n\n\n\nI figured hash table lookups would be high on the list, but not that high. DMD makes heavy use of hash tables internally. The symbol tables are all hash tables, the template instantiations are (now!), etc. Looking at the code, nothing too obvious stuck out:\n\n\n\n    if (aa)\n    {\n        size_t len = aa->b_length;\n        size_t i = (size_t)key % len;\n        aaA* e = aa->b[i];\n        while (e)\n        {\n            if (key == e->key)\n                return e->value; // found\n            e = e->next;\n        }\n    }\n    return NULL;    // not found\n\n\n\n\nBut there is something interesting about it — the key%len operation. Division is a notoriously expensive operation. I could replace it with a mask, but that supposedly leads to a not-so-uniform distribution. A few months back, I implemented in the compiler back end the optimization of replacing an integral divide by constant with a multiply by the reciprocal. [4] It's a lot faster, but here the denominator is a variable, and so I can't use that optimization.\n\n\n\nOr can I? The denominator is a prime number pulled out of a table of primes:\n\n\n\n    static const size_t prime_list[] = {\n              31UL,\n              97UL,            389UL,\n            1543UL,          6151UL,\n            24593UL,          98317UL,\n          393241UL,        1572869UL,\n          6291469UL,      25165843UL,\n        100663319UL,      402653189UL,\n      1610612741UL,    4294967291UL,\n    };\n\n\n\n\nIf the hash table chains get too long, it is rehashed using a larger prime from this table. But most of the time, the symbol tables are small (every scope has its own symbol table). So I rewrote the key%len as:\n\n\n\n    size_t i;\n    size_t len = aa->b_length;\n    if (len == 4)\n        i = (size_t)key & 3;\n    else if (len == 31)\n        i = (size_t)key % 31;\n    else\n        i = (size_t)key % len;\n\n\n\n\nThe smallest bucket has a length of 4, the next larger size is 31. So, for the most common cases, the compiler is able to generate the fast modulo code because it's dividing by a known constant.\n\n\n\nThis change alone was worth a 5% speed boost. Not bad for 3 or 4 lines of code!\n\n\n\nThat left storage allocation.\n\n\n\nStorage allocation is one of the great unsolved problems in programming. You can do manual allocation at the expense of endless pointer bugs. You can do reference counting with its performance problems and bloat. You can do garbage collection with its pausing and excessive memory consumption.\n\n\n\nDMD does memory allocation in a bit of a sneaky way. Since compilers are short-lived programs, and speed is of the essence, DMD just mallocs away, and never frees. This eliminates the scaffolding and complexity of figuring out who owns the memory and when it should be released. (It has the downside of consuming all the resources of your machine if the module being compiled is big enough.)\n\n\n\nBut malloc() itself is designed with the presumption that the memory allocated will eventually get freed. Since it's not in DMD, I tried replacing the storage allocator with a dead simple bump-the-pointer greedy allocator:\n\n\n\n    #define CHUNK_SIZE (4096 * 16)\n\n    static size_t heapleft = 0;\n    static void *heapp;\n\n    void * operator new(size_t m_size) {\n        // 16 byte alignment is better\n        // (and sometimes needed) for doubles\n        m_size = (m_size + 15) & ~15;\n\n        // The layout of the code is selected so the\n        // most common case is straight through\n        if (m_size <= heapleft) {\n          L1:\n            heapleft -= m_size;\n            void *p = heapp;\n            heapp = (void *)((char *)heapp + m_size);\n            return p;\n        }\n\n        if (m_size > CHUNK_SIZE) {\n            void *p = malloc(m_size);\n            if (p)\n                return p;\n            printf(\"Error: out of memory\\n\");\n            exit(EXIT_FAILURE);\n            return p;\n        }\n\n        heapleft = CHUNK_SIZE;\n        heapp = malloc(CHUNK_SIZE);\n        if (!heapp) {\n            printf(\"Error: out of memory\\n\");\n            exit(EXIT_FAILURE);\n        }\n        goto L1;\n    }\n\n    void operator delete(void *p) { }\n\n\n\n\nIt's like I poured nitrous down the compiler's throat. Along with the other improvements, this achieved a near doubling of the compiler speed. (It even reduced the memory consumption somewhat, apparently because malloc() needed memory for its data structures.)\n\n\n\nThe benchmark used was compiling Phobos' std.algorithm for unit tests, which is:\n\n\n\n    dmd std\\algorithm -unittest –main\n\n\n\n\ndmd 2.063 does the nasty in 21.56 seconds, and the latest development version does it in 12.19. This was done on Windows. There's a little bit of apples versus oranges here, because the latest Phobos is larger than the 2.063 one.\n\n\n\n(DMD 2.064 with this improvement hasn't been released yet, but of course you can try it out from github.)\n\n\n\n\nConclusion\n\n\n\nEven if you know your code well, you're likely wrong about where the performance bottlenecks are. Use a profiler. If you haven't used one on your codebase in a while, it's highly likely there's a bottleneck in there that's fixable with just a few lines of code.\n\n\n\nReferences\n\n\n\n\n	DMD source code\n\n	hash tables\n\n	storage allocator\n\n	Torbjörn Granlund and Peter Montgomery, Division by Invariant Integers using Multiplication\n\n	std.algorithm source code\n\n\n\nAcknowledgments\n\n\n\nThanks to Jason House and Andrei Alexandrescu for their helpful comments on this article.\n\n	\n	\n\n	\n\n\n\n\n\n\n\n	\n\n\n\n\n\n\n\n\n\n\n\n\n \n\n		\n		\n		\n		\n	             \n     \n      \n    Related Reading\n\n    \n        	News\n	Commentary\n\n\n        \n            News\n\n			    	Tools To Build Payment-Enabled Mobile Apps\n	Did Barcode Reading Just Get Interesting?\n	Restlet Completes \"Complete\" API Platform\n	Boost.org Committee Battles Library Log-Jam\n\nMore News»                        \n                        \n        \n\n        \n            Commentary\n\n			    	Did Barcode Reading Just Get Interesting?\n	Jolt Awards 2015: Coding Tools\n	Thriving Among the APIs\n	Jelastic Docker Integration For Orchestrated Delivery\n\nMore Commentary»                        \n            \n\n    \n     \n      \n    \n        	Slideshow\n	Video\n\n\n        \n            Slideshow\n\n			    	Jolt Awards: The Best Books\n	Jolt Awards 2014: The Best Testing Tools\n	2014 Developer Salary Survey\n	Jolt Awards: The Best Books\n\nMore Slideshows»                        \n            \n        \n        \n            Video\n\n			    	Intel at Mobile World Congress\n	FIRST Robotics Competition Gears Up\n	Teen Computer Scientist Wins Big at ISEF\n	Amazon Connection: Broadband in the Rainforest\n\nMore Videos»                        \n            \n\n    \n \n    \n    \n        	Most Popular\n\n\n        \n            Most Popular\n\n			    	State Machine Design in C++\n	Building Scalable Web Architecture and Distributed Systems\n	Jolt Awards 2015: Coding Tools\n	Finding the Median of Two Sorted Arrays Efficiently\n\nMore Popular»                        \n            \n        \n    \n  \n    \n\n    \n    \n\n\n			\n			\n 			\n				\n					INFO-LINK\n	\n				\n			\n		\n\n	\n	\n				\n			\n		\n\n	\n	\n				\n			\n		\n\n	\n	\n				\n			\n		\n\n	\n\n\n\n\n\n\n				\n\n				\n\n				\n\n				\n							\n			\n		\n\n					 	\n 			\n\n 		\n\n 		\n		\n		\n		\n  \n\n\n\n\n\n\n\n\n	\n    	\n    		\n    	\n\n    	\n\n \n    	\n      		Currently we allow the following HTML tags in comments:\n\n			Single tags\n\n			These tags can be used alone and don't need an ending tag. \n\n			<br> Defines a single line break \n\n \n			<hr> Defines a horizontal line\n\n\n			Matching tags\n\n			These require an ending tag - e.g. <i>italic text</i> \n\n			<a> Defines an anchor\n\n\n			<b> Defines bold text\n\n\n			<big> Defines big text\n\n\n\n			<blockquote> Defines a long quotation\n\n\n			<caption> Defines a table caption\n\n\n			<cite> Defines a citation\n\n\n			<code> Defines computer code text\n\n\n			<em> Defines emphasized text\n\n\n\n			<fieldset> Defines a border around elements in a form\n\n\n			<h1> This is heading 1\n\n\n			<h2> This is heading 2\n\n\n			<h3> This is heading 3\n\n\n			<h4> This is heading 4\n\n\n			\n			<h5> This is heading 5\n\n\n			<h6> This is heading 6\n\n\n			<i> Defines italic text\n\n\n			<p> Defines a paragraph\n\n\n			<pre> Defines preformatted text\n\n\n			\n			<q> Defines a short quotation\n\n\n			<samp> Defines sample computer code text\n\n\n			<small> Defines small text\n\n\n			<span> Defines a section in a document\n\n\n			<s> Defines strikethrough text\n\n\n			\n			<strike> Defines strikethrough text \n\n\n			<strong> Defines strong text \n\n\n			<sub> Defines subscripted text \n\n\n			<sup> Defines superscripted text \n\n\n			<u> Defines underlined text\n\n\n\n    	\n  \n	\n\n\n	\n		  Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task.  \n		  However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.\n\n	\n	\n\n\n	 \n		\n			To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.\n			\n		\n	 \n\n  \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n		\n\n\n\n\n\n		\n \n		\n			\n\n\n\n\n\n\n\n\n\n\n		\n\n		\n			\n 						\n		 \n				\n\n\n			\n			\n				\n	C/C++ Recent Articles\n\n		Dr. Dobb's Archive\n	Jolt Awards 2015: Coding Tools\n	Building Node.js Projects in Visual Studio\n	Building Portable Games in C++\n	C# and .NET's Sudden Ubiquity\n	\n	\n\n			\n\n			\n						\n			\n				Most Popular\n\n\n\n	Stories\n	Blogs\n\n\n\n		\n					The C++14 Standard: What You Need to Know\n					\n	\n					A Simple and Efficient FFT Implementation in C++:\n Part I\n					\n	\n					State Machine Design in C++\n					\n	\n					Lambdas in C++11\n					\n	\n					A Lightweight Logger for C++\n					\n	\n					\n					\n	\n					\n					\n\n\n\n\n\n		\n					C++11: unique_ptr\n					\n	\n					C++11's async Template\n					\n	\n					Abstractions For Binary Search, Part 10: Putting It All Together\n					\n	\n					It's Hard To Compare Floating-Point Numbers\n					\n	\n					Auto Types and Range-Based For Statements in C++11\n					\n	\n					\n					\n	\n					\n					\n\n\n\n			\n\n			\n			\n							\n\n			\n			\n				\n\n	This month's Dr. Dobb's Journal\n\n	\n		\n			\n				\n					\n				\n			\n\n			\n				This month, \nDr. Dobb's Journal is devoted to mobile programming. We introduce you to Apple's new Swift programming language, discuss the perils of being the third-most-popular mobile platform, revisit SQLite on Android		\n				, and much more!\n\n\n				Download the latest issue today. >> \n			\n\n		\n	\n	\n				 \n			\n\n			\n			\n			\n			\n		\n\n		\n			\n\n\n\n   \n\n\n    Upcoming Events\n\n    \n    \n    \n       \n         Live Events\n         WebCasts\n                 \n       \n\n              \n       \n           		 \n     	  	Interop ITX: The Independent Conference For Tech Leaders (April 30 - May 4 In Las Vegas) - InteropITX 2018\n	                 \n	Enterprise Connect Conference and Expo: March 12-15 - Enterprise Connect Orlando\n	                 \n	Speech Technologies - New Track at Enterprise Connect! - Enterprise Connect Orlando\n	                 \n	Transform Business Processes with API-enabled Integrations - Enterprise Connect Orlando\n	                 \n\n\n  \n \n   \n			\n       \n\n      \n                        \n         		 \n      \n                        		Step Up Your Game in Loan Operations in 2014                        	\n                            \n	New Technologies to Optimize Mobile Financial Services                        	\n                            \n	Accelerate Cloud Computing Success with Open Standards                        	\n                            \n	Defense Against the Dark Arts                        	\n                            \n	Developing a User-Centric Secure Mobile Strategy: It's in Reach                        	\n                            \n\n\n	   \n		   \n\n           		 More Webcasts>>\n\n     	   \n        \n		       \n       \n       \n                \n    \n    \n\n\n\n\n\n\n\n\n\n	Featured Reports\n\n	 \n			 \n\n			What's this?\n\n		\n\n	\n				 \n       \n			\n					\n				Cloud Collaboration Tools: Big Hopes, Big Needs				\n								\n			\n					\n				Return of the Silos				\n								\n			\n					\n				State of Cloud 2011: Time for Process Maturation				\n								\n			\n					\n				SaaS and E-Discovery: Navigating Complex Waters				\n								\n			\n					\n				SaaS 2011: Adoption Soars, Yet Deployment Concerns Linger				\n								\n\n\n		More >>\n\n		\n	\n	\n\n\n\n\n\n\n	Featured Whitepapers\n\n	 \n			 \n\n			What's this?\n\n		\n\n	\n				 \n       \n			\n					\n				Driving Your Cloud Strategy with Private Network Solutions				\n								\n			\n					\n				Securosis Analyst Report: Security and Privacy on the Encrypted Network				\n								\n			\n					\n				Vulnerability Threat Management in 2015				\n								\n			\n					\n				Case Study: Gilt				\n								\n			\n					\n				Book Expert: Advanced Analytics with Spark: Patterns for Learning Data at Scale				\n								\n\n\n		More >>\n\n		\n	\n	\n\n\n\n\n\n\n\n \n			\n				Most Recent Premium Content\n\n\n	Digital Issues\n\n	\n\n\n2014\n\nDr. Dobb's Journal\n	November - Mobile Development\n	August - Web Development\n	May - Testing\n	February - Languages\n\n\n\n\nDr. Dobb's Tech Digest\n\n	DevOps\n	Open Source\n	Windows and .NET programming\n	The Design of Messaging Middleware and 10 Tips from Tech Writers\n	Parallel Array Operations in Java 8 and Android on x86: Java Native Interface and the Android Native Development Kit\n\n\n\n\n2013\n	January - Mobile Development\n	February - Parallel Programming\n	March - Windows Programming\n	April - Programming Languages\n	May - Web Development\n	June - Database Development\n	July - Testing\n	August - Debugging and Defect Management\n	September - Version Control\n	October - DevOps\n	November- Really Big Data\n	December - Design\n\n\n\n\n2012\n	January - C & C++\n	February - Parallel Programming\n	March - Microsoft Technologies\n	April - Mobile Development\n	May - Database Programming\n	June - Web Development\n	July - Security\n	August - ALM & Development Tools\n	September - Cloud & Web Development\n	October - JVM Languages\n	November - Testing\n	December - DevOps\n\n\n	\n	\n2011\n\n\n			\n			\n\n		\n\n		\n		\n		\n			\n\n\n\n\n\n\n\n  \n    \n    \n    \n    \n    \n\n    \n    \n      \n        	TECHNOLOGY GROUP\n	Black Hat\n	Content Marketing Institute\n	Content Marketing World\n	Dark Reading\n\n\n        	Enterprise Connect\n	GDC\n	Gamasutra\n	HDI\n\n\n        	ICMI\n	InformationWeek\n	INsecurity\n	Interop ITX\n\n\n        	Network Computing\n	No Jitter\n	Service Management World\n	VRDC\n\n\n      \n\n      \n        	COMMUNITIES SERVED\n	Content Marketing\n	Enterprise IT\n	Enterprise Communications\n	Game Developers\n	Information Security\n	IT Services & Support\n\n\n      \n\n      \n        	WORKING WITH US\n	Advertising Contacts\n	Event Calendar\n	Tech Marketing\n	Solutions\n	Contact Us\n	Licensing\n\n\n      \n\n    \n\n  \n\n    \n\n  \n    	Terms of Service\n	Privacy Statement\n	Legal Entities\n	Copyright © 2018 UBM, All rights reserved\n\n\n  \n\n  \n\n\n\n\n\n\n\n\n\n\n      \n	\n	  \n      	Dr. Dobb's Home\n	Articles\n	News\n	Blogs\n	Source Code\n	Dobb's TV\n	Webinars & Events\n\n\n	  \n	\n\n	  \n	\n	  \n      	About Us\n	Contact Us\n	Site Map\n	Editorial Calendar\n\n\n	  \n	\n\n  \n\n\n\n\n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n		\n\n		\n	\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n", "title"="Increasing Compiler Speed by Over 75% | Dr Dobb's"}