Chugalug Linux Users Group - RE: Big Memory
CHUGALUG: Chattanooga Unix Gnu and Linux User Group
From: Matt Keys
------------------------------------------------------
I built a small vmware host in my newegg wishlist that was a lot of bang for the buck. My notes on it were "16 cores at 2GHz, 24GB DDR3 1066, 2.5TB disk in a 4U for under $2000"

1x rosewill rsv-l4000 4U case $109
1x asus kgpe-d16 dual socket g34 $399
1x corsair 750w $115
2x amd opteron 6128 (8 cores each) $520
1x g.skill 24gb (6x4gb) ddr3 1333 $239 (out of stock now)
5x 500gb sata WD caviar blacks (deactivated now, was around $50/ea)
1x arctic cooling thermal compound $13

From: chugalug-bounces@chugalug.org [mailto:chugalug-bounces@chugalug.org] On Behalf Of Eric Wolf
Sent: Tuesday, June 28, 2011 1:32 PM
To: CHUGALUG
Subject: Re: [Chugalug] Big Memory

I think I'm going to take the next logical/lazy step and write the index to SQLite and let the library do the dirty work for me. I'm spending too much time thinking about this.

And yeah, a half TB of RAM seems ridiculous but it's surprisingly doable. You can build a 1/4 TB RAM machine with parts from NewEgg for under $7K. Figure you guys have been talking about building systems with 1000s of processors for Bitcoin mining. Makes sense that RAM would work proportionally as well.

We need a "NewEgg Index": What is the phattest machine that can be built from parts in stock at NewEgg?

CPU: How many cores? What speed?
RAM: TBs?
Disk: PBs?
GPU: 10K?

The motherboard I was looking at could support 48 CPU cores, 256GB RAM but the rest gets harder because you wouldn't put too many drives in a single cabinet (just use NAS) and to get the GPU count up, you are using bus extenders...

Thanks for the input...

-Eric
-=--=---=----=----=---=--=-=--=---=----=---=--=-=-
Eric B. Wolf
720-334-7734

On Tue, Jun 28, 2011 at 11:17 AM, Chad Smith wrote:

The more I read the more amazed I get...

HALF A TERABYTE OF RAM!!!!

it's like "1.21 JiggaWatts!!!"

(I know it's Gigawatts, but that's not what the man said.)

- Chad W Smith

"I like a man who's middle name is W."
- President George W. Bush - February 10, 2003
bit.ly/gwb-dubya

On Tue, Jun 28, 2011 at 12:09 PM, Aaron welch wrote:

Hive running on a Cassandra ring would be easier. That gives you an SQL interface over a distributed node cluster with linear performance gains from adding new hosts.

http://www.datastax.com/products/brisk

-AW

On Tue, Jun 28, 2011 at 1:06 PM, Eric Wolf wrote:

Like I said, I'm being lazy with the code. Map-Reducing the problem is not lazy.

-Eric
-=--=---=----=----=---=--=-=--=---=----=---=--=-=-
Eric B. Wolf
720-334-7734

On Tue, Jun 28, 2011 at 10:58 AM, Ryan Bales wrote:

You don't need big memory if you're able to distribute the load with something like MapReduce. I know GAE supports MapReduce, and I'm sure there are others. GAE also supports WSGI, so you're good to go with python.

~Ryan Bales

On Tue, Jun 28, 2011 at 11:20 AM, Eric Wolf wrote:

I'm currently trying to work with a really big data file (473GB) with some Python code. I'm building an index in RAM in Python with a set. Currently, I am running out of RAM (and VM) on my system with 8GB of RAM and 12GB of VM.

I have two options: rewrite the code so it's slower but fits in my available memory or push it out somewhere where I can have the RAM to do the job. The "slower" bit may end up being a deal breaker because I anticipate the jobs to take a couple days even working straight from RAM. "Slower" might mean weeks or months. So I have time to explore finding someplace else to run this.

So what I need is a platform that provides a reasonably current Python installation, 512GB of RAM and 2-3TBs of disk.

Looking on NewEgg, the biggest system I can build is a 256GB RAM box starting around $6K. I could build a system with 128GB of RAM and use a 512GB SSD for VM for starting around $5K. The money isn't a deal breaker but it still doesn't guarantee I can achieve what I need - hours or days instead of weeks or months.

The largest EC2 instance Amazon offers has only 68GB of RAM.
I'll probably try that next just because it's a cheaper way to get out of my 8GB physical limitation. Cloud is more appealing because I really don't want to have to waste a day or two building a box (in addition to the purchasing headaches). And I may not need the system after this project.

Are there any other options out there for large memory cloud systems?

-Eric
-=--=---=----=----=---=--=-=--=---=----=---=--=-=-
Eric B. Wolf
720-334-7734
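
The idea Eric mentions in the thread, writing the index to SQLite instead of holding a Python set in RAM, could look roughly like the sketch below. It is only an illustration under assumptions: the thread never says what the file format or the keys are, so the one-key-per-line input, the `build_index` and `contains` helpers, the `seen` table name, and the batch size are all hypothetical.

```python
import sqlite3
from itertools import islice


def build_index(lines, db_path=":memory:", batch_size=100_000):
    """Store unique keys in SQLite instead of an in-RAM Python set.

    Assumption: each item in `lines` is one key (the thread does not
    say what the real keys are). Returns the open connection.
    """
    conn = sqlite3.connect(db_path)
    # PRIMARY KEY gives set semantics: the index rejects duplicate keys.
    conn.execute("CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY)")
    it = iter(lines)
    while True:
        # Pull a bounded batch so memory use stays flat however big the input is.
        batch = [(line.rstrip("\n"),) for line in islice(it, batch_size)]
        if not batch:
            break
        with conn:  # commit one transaction per batch
            # INSERT OR IGNORE silently skips keys that are already present.
            conn.executemany("INSERT OR IGNORE INTO seen (key) VALUES (?)", batch)
    return conn


def contains(conn, key):
    """Membership test, the on-disk counterpart of `key in some_set`."""
    row = conn.execute("SELECT 1 FROM seen WHERE key = ?", (key,)).fetchone()
    return row is not None
```

For a real run, `db_path` would be a file on disk and `lines` an open file object, so that neither the data nor the index has to fit in memory. The trade-off is the one discussed in the thread: lookups go through SQLite's on-disk index and will be slower than a set in RAM, but memory use stays bounded by the batch size and SQLite's page cache.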